PREFACE


The Pattern Analysis Branch, Mapping, Charting and Geodesy (MC&G)
Division, of the Naval Ocean Research and Development Activity (NORDA)
has been involved over the past several years in the development of
algorithms and techniques for computer recognition of free-form
handprinted symbols as they appear on the Defense Mapping Agency (DMA)
maps and charts. NORDA has made significant contributions to the
automation of MC&G through advancing the state of the art in such
information extraction techniques. In particular, new concepts in
character (symbol) skeletonization, rugged feature measurements, and
expert system-oriented decision logic have allowed the development of
a very high performance Handprinted Symbol Recognition (HSR) system
for identifying depth soundings from naval smooth sheets (accuracies
greater than 99.5%).

The study reported in this technical note is part of NORDA's continuing
research and development in pattern and shape analysis as it
applies to Navy and DMA ocean/environment problems. The issue addressed
in this technical note deals with emerging areas of syntactic and
semantic techniques in pattern recognition as they might apply to the
free-form symbol problem. The author was asked to review these powerful
tools in light of his earlier support to the Pattern Analysis
Laboratory [1] and to analyze their potential for extending the HSR
system to a wider range of symbols. These results contribute to the
overall NORDA R&D effort to investigate and develop methods for more
precise geometric shape descriptions for application to Ocean Science
Information Extraction (OSIS) problems.
I. INTRODUCTION


This report discusses a character recognition approach based on the
use of syntactic/semantic concepts. This approach is consistent with
our earlier work for NORDA in the sense that it is based on the same
philosophy and types of features that were recommended in an earlier
report [1] and which have been investigated since then. The material in
the following sections unifies these recommendations and includes
extensions such as techniques for handling interconnections between
features, the recognition of feature strings by syntactic methods, and the
use of semantics for quality assurance both in the computation of
features and in the recognition stage. The methods described in this
report are intended as a complement to the techniques presently being
used in the NORDA OCR system, and as a potential tool for handling
forthcoming problems in alphanumeric character recognition.

The structure of the proposed approach is shown diagrammatically in
Fig. 1. It is assumed that the input to the system is a skeleton of
the character to be recognized. The selection of a skeleton input is
consistent with the processing capabilities of the present OCR system.
A skeleton representation also has the advantage that it facilitates the
computation of features such as bays, lakes, and branch points, which have
been deemed essential for rugged character representations. It is noted,
however, that the methods discussed in the following sections could
easily be modified to accept character outline (border) inputs.

The feature extraction and attribute assignment stage has the
function of computing and quantifying all the features required for
recognition. As explained in more detail in Section 2, this stage is
based on a hierarchical, semantic-guided approach. The function of the
screening stage is to select a particular set of recognition modules to
process a given input. The basic idea is that, at this point in the overall
process, the features detected in a character can be used to advantage
in guiding the recognition strategy to be applied to the input. The
pre-selection of a subset of recognition procedures not only simplifies
the organization of the classifier, but has the added advantage of
operational efficiency. The recognizer is based on the use of
syntactic/semantic techniques. As described in Section 3, the syntax
establishes the structure of the pattern classes under consideration,
while the semantics establish the meaning or validity of a particular
pattern in the context of that structure.

Figure 1. Syntactic/semantic character processor.

One  of  the  most  important  aspects  in  the  selection  of  any  pattern 
recognition  approach  is  the  availability  of  learning  algorithms. 

The  techniques  discussed  in  this  report  are  formulated  to  take  advantage 
of  one  of  the  most  powerful  learning  techniques  available  for  syntactic 
systems.  As  discussed  in  Section  4,  the  proposed  learning  algorithm 
depends  on  only  one  user-specified  parameter,  and  the  behavior  of  the 
procedure  as  a  function  of  this  parameter  is  well  understood  and  easily 
analyzed. 

Another  important  advantage  of  the  approach  discussed  in  the 
following  sections  is  that  it  includes  a  procedure  for  checking  class 
separability.  Given  the  recognizers  learned  from  a  set  of  training 
pattern  classes,  the  procedure  discussed  in  Section  5  identifies  the 
patterns  that  cannot  be  classified  into  a  unique  class,  thus  yielding 
information  related  to  the  discriminatory  power  of  the  features  used  in 
the  system,  and  the  structure  of  the  patterns  in  the  overlapping  regions. 

Although, as indicated above, learning algorithms already exist for
the syntactic components of the character processor, no such algorithms
are yet known for the semantics. The material in Section 6 addresses this
problem from an interactive point of view which utilizes automatic
learning for the syntax and user-defined rules to establish the
corresponding semantic components of the system.




II.  FEATURE  EXTRACTION  AND  ATTRIBUTE  ASSIGNMENT 
2.1  Background 

In this section we discuss a hierarchical, semantics-based approach
for feature extraction, as well as the assignment of attributes to those
features.

The basic approach is shown diagrammatically in Fig. 2. For features
at level k of the hierarchy, we consider a structural description of
the form

Level k/features/attributes/primitives/semantic rules

The  hierarchical  nature  of  the  method  implies  that  the  procedure  starts 
with  simple  primitives  and  successively  builds  more  complex  features  from 
them.  It  is  noted  that  what  we  call  primitives  in  the  computation  of 
a  feature  at  level  k  may  be  features  that  have  been  computed  at  levels 
lower  than  k.  This  terminology  is  used  for  consistency  in  the  structural 
description  given  above. 

The attributes are used for characterizing each feature with descriptors
such as length, orientation, and location of its centroid. The
use of semantics allows quality control of the features generated at
all levels of the hierarchy. The approach is to use semantic rules in
order to guarantee that all features used for subsequent recognition are
meaningful in the context of character recognition.

Level zero of the hierarchy consists of the input data to the recognition
system (e.g., character skeletons). The function of the other
levels is explained in the following sections.

Figure 2. Hierarchical, semantic-guided feature extraction and
attribute assignment.

2.2 Level One: End Points

Level 1 of the hierarchy extracts all end points in a given skeleton.
The structural description is

Level 1/end point/attributes/primitives/semantic rules

where the elements of the description are as follows:

Attributes

The  only  attribute  assigned  to  an  end  point  is  its  location. 

Primitives 

The primitive of an end point is a single pixel satisfying the
following rule:

Semantic Rule

$R_1(1)$ = TRUE for any pixel with exactly one m-neighbor.

Thus, all pixels for which $R_1(1)$ is TRUE are labeled as end-point features.

2.3  Level  Two:  Branch  Points 

Level  2  of  the  hierarchy  extracts  branch  points  using  the  following 
description 

Level  2/branch  point/attributes/primitives/semantic  rules 
where  the  elements  of  the  description  are  as  follows: 

Attributes 

The  attributes  assigned  to  a  branch  point  are  its  location  and  number 
of  branches  attached  to  it. 

Primitives

The only primitive of a branch point is a single pixel satisfying
the following rule:

Semantic rule

$R_1(2)$ = TRUE for any pixel with more than two, and fewer than $T_1(2)$,
m-neighbors, where $T_1(2)$ is a threshold (e.g., four).

All pixels in an input skeleton for which $R_1(2)$ is TRUE are labeled as
branch-point features.
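
The end-point and branch-point rules can be sketched in a few lines. This is a minimal illustration, not the NORDA implementation. It assumes that "m-neighbor" refers to the usual mixed-connectivity definition (a 4-neighbor always counts, and a diagonal neighbor counts only when the two 4-neighbors it shares with the pixel are background), and that the skeleton is a list of rows of 0/1 values. The names `m_neighbors`, `end_and_branch_points`, and `t1_2` are hypothetical.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def m_neighbors(img: List[List[int]], r: int, c: int) -> List[Pixel]:
    """Return the m-neighbors (assumed: mixed connectivity) of pixel (r, c)."""
    rows, cols = len(img), len(img[0])

    def on(rr: int, cc: int) -> bool:
        return 0 <= rr < rows and 0 <= cc < cols and img[rr][cc] == 1

    # 4-neighbors always count.
    found = [(r + dr, c + dc)
             for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
             if on(r + dr, c + dc)]
    # A diagonal neighbor counts only if the two 4-neighbors it shares
    # with (r, c) are both background.
    for dr, dc in ((-1, -1), (-1, 1), (1, -1), (1, 1)):
        if on(r + dr, c + dc) and not on(r + dr, c) and not on(r, c + dc):
            found.append((r + dr, c + dc))
    return found


def end_and_branch_points(img: List[List[int]],
                          t1_2: int = 4) -> Tuple[Set[Pixel], Set[Pixel]]:
    """Label end points (one m-neighbor) and branch points (more than two
    and fewer than t1_2 m-neighbors)."""
    ends: Set[Pixel] = set()
    branches: Set[Pixel] = set()
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value != 1:
                continue
            n = len(m_neighbors(img, r, c))
            if n == 1:
                ends.add((r, c))
            elif 2 < n < t1_2:
                branches.add((r, c))
    return ends, branches


# A small T-shaped skeleton: three end points and one branch point.
T_SHAPE = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
ends, branches = end_and_branch_points(T_SHAPE)
```

With the example threshold of four, a branch point in this sketch is a pixel with exactly three m-neighbors.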

2.4  Level  Three:  Arcs 

Arc  features  have  the  structural  description 

Level 3/arc/attributes/primitives/semantic rules
where  the  elements  of  the  description  are  as  follows: 


Attributes 

The attributes assigned to an arc are the location of its two terminator
points and its length. A terminator point in this case is either
an end point or a branch point,† and the arc length is the checkerboard
distance between the terminator points.

Primitives 

The primitives of an arc are the set of pixels satisfying the
following semantic rules:

Semantic Rules

$R_1(3)$ = TRUE if only two distinct pixels are terminator points.

$R_2(3)$ = TRUE if there is only one set of pixels, each pixel having
exactly two m-neighbors, and lying between the terminator
points identified in $R_1(3)$.

An arc feature is then a set of pixels for which $R_1(3) \cap R_2(3)$ = TRUE.

2.5 Level Four: Lakes

The structural description for lakes has the form

Level 4/lake/attributes/primitives/semantic rules

Attributes

The attributes assigned to a lake feature are: (1) the location of
its centroid; (2) the error of its least-square-error elliptical fit (see
Appendix A); (3) the direction of its principal axes, $d_1(4)$ and $d_2(4)$;
(4) the variance (spread) along each principal axis, $v_1(4)$ and $v_2(4)$; and
(5) the location of any branch points along the boundary.

Primitives 

The  primitives  of  a  lake  are  pixels  satisfying  the  following  semantic 
rules: 

† More generally, a terminator point is any pixel that denotes the end of a
feature. The feature may be embedded between two other features, in
which case both terminator points could be internal pixels.


Semantic  rules 

$R_1(4)$ = TRUE if there is a set of m-connected pixels (including
branch points) forming a closed boundary.

$R_2(4)$ = TRUE if the elliptical fit error is less than a threshold
$T_1(4)$.

$R_3(4)$ = TRUE if the ratio $v_1(4)/v_2(4)$ is less than a threshold
$T_2(4)$, where it is assumed that $v_1(4) \geq v_2(4)$.

Thus, a lake feature is a set of pixels for which
$R_1(4) \cap R_2(4) \cap R_3(4)$ = TRUE. It is noted that rule $R_1(4)$
establishes a lake in the general sense that it refers to a closed
boundary. Rules $R_2(4)$ and $R_3(4)$, however, further refine this concept
by establishing a valid lake shape for the purpose of character recognition.
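
As a rough illustration of the lake attributes, the sketch below computes the centroid, the spread along each principal axis, and the direction of the major axis from the boundary pixels, then applies the variance-ratio rule. It assumes the principal axes are those of the 2x2 sample covariance of the pixel coordinates. The elliptical-fit error of Appendix A is not reproduced here, and the function names and the degenerate-case handling are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def principal_spread(points: List[Point]):
    """Return (centroid, v1, v2, theta) with v1 >= v2 the variances along
    the principal axes and theta the major-axis direction in radians."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Closed-form eigenvalues of the symmetric 2x2 covariance matrix.
    mean = (sxx + syy) / 2.0
    half = math.hypot((sxx - syy) / 2.0, sxy)
    v1, v2 = mean + half, mean - half
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), v1, v2, theta


def variance_ratio_rule(points: List[Point], t2_4: float) -> bool:
    """Variance-ratio rule: TRUE if v1/v2 is below the threshold t2_4.
    A degenerate lake (v2 == 0) is rejected; that choice is an assumption."""
    _, v1, v2, _ = principal_spread(points)
    return v2 > 0 and v1 / v2 < t2_4


# Four boundary samples of an axis-aligned 4-by-2 oval.
oval = [(2.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
centroid, v1, v2, theta = principal_spread(oval)
```

For this oval the spread ratio is 4, so the rule passes with a threshold of 5 and fails with a threshold of 3.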

2.6  Level  Five:  Polygonal  Segments 

The  features  discussed  in  this  and  the  following  three  sections  deal 
with  the  characterization  of  arcs.  The  first  step  is  to  approximate  a 
given  arc  by  a  set  of  connected  polygonal  segments  using  the  structural 
description 

Level  5/polygonal  segment/attributes/primitives/semantic  rules 
The  elements  of  this  description  are  as  follows: 

Attributes 

The attributes used for each polygonal segment are: (1) length, (2)
direction, (3) location of terminator points, (4) location of centroid
(i.e., midpoint), and (5) approximation error.

Primitives 

The primitives of polygonal segments are the pixels in a given arc.

Semantic rules

$R_1(5)$ = TRUE if the mean-squared error between a polygonal segment
and its corresponding arc is less than a threshold $T_1(5)$.

$R_2(5)$ = TRUE if the number of polygonal segments satisfying $R_1(5)$
is less than a threshold $T_2(5)$.

We say that a polygonal approximation of a given arc is valid if
$R_1(5) \cap R_2(5)$ = TRUE. Semantic rule $R_1(5)$ establishes, by means of
$T_1(5)$, an acceptable approximation in a mean-squared-error sense. Since
it is always possible to make all errors arbitrarily small (the limiting case
is zero by using n - 1 segments, where n is the number of pixels),
semantic rule $R_2(5)$ is used as an "irregularity filter." That is,
preselecting the maximum number of polygonal segments that are allowed
eliminates as unacceptable irregular arcs that require a greater number
of segments in order to satisfy the error criterion in rule $R_1(5)$.

The  direction  attribute  of  each  polygonal  segment  is  quantized  into 
one  of  eight  possible  directions,  as  shown  in  Fig.  3.  Since  two 
directions  differing  by  180°  are  possible  for  each  segment,  the  ambiguity 
is resolved by assuming a standard clockwise, up-down scan of the
polygonal structure.
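
The two polygonal-segment rules can be illustrated with a simple recursive splitting scheme. The report does not say how the segments are obtained, so the split-at-the-farthest-pixel strategy below is an assumption chosen for brevity. The names `split_arc`, `valid_polygonal_approximation`, `t1_5`, and `t2_5` are hypothetical, and `t1_5` is assumed to be positive.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def _sq_dist_to_line(p: Point, a: Point, b: Point) -> float:
    """Squared perpendicular distance from p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm2 = dx * dx + dy * dy
    if norm2 == 0:
        return (p[0] - a[0]) ** 2 + (p[1] - a[1]) ** 2
    cross = dx * (p[1] - a[1]) - dy * (p[0] - a[0])
    return cross * cross / norm2


def split_arc(arc: List[Point], t1_5: float,
              lo: int = 0, hi: Optional[int] = None) -> List[int]:
    """Return breakpoint indices so that every chord has mean-squared
    error below t1_5 (first rule). Splits at the farthest interior pixel."""
    if hi is None:
        hi = len(arc) - 1
    d2 = [_sq_dist_to_line(arc[i], arc[lo], arc[hi]) for i in range(lo, hi + 1)]
    if hi - lo < 2 or sum(d2) / len(d2) < t1_5:
        return [lo, hi]
    # Only interior pixels are split candidates, so recursion terminates.
    k = lo + max(range(1, len(d2) - 1), key=d2.__getitem__)
    left = split_arc(arc, t1_5, lo, k)
    right = split_arc(arc, t1_5, k, hi)
    return left + right[1:]


def valid_polygonal_approximation(arc: List[Point], t1_5: float,
                                  t2_5: int) -> Optional[List[int]]:
    """Apply the irregularity filter (second rule): reject the arc when the
    number of segments is not below t2_5."""
    breaks = split_arc(arc, t1_5)
    return breaks if len(breaks) - 1 < t2_5 else None


# An L-shaped arc needs exactly two segments.
l_arc = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
breaks = split_arc(l_arc, t1_5=0.1)
```

Here `valid_polygonal_approximation(l_arc, 0.1, 3)` accepts the arc, while a segment limit of 2 rejects it as too irregular.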

2.7  Level  Six:  Straight-Line  Segments 

At  this  level  in  the  hierarchy  we  consider  straight-line  segments 
(SLS's)  which  are  the  least  complex  features  that  can  be  formed  using 
polygonal  segments  as  primitives.  The  structural  description  is  as 
follows: 


Level  6/SLS/attributes/primitives/semantic  rules 
Attributes 

The attributes of each SLS are: (1) length, (2) direction, (3) location
of its centroid, and (4) location of its terminator points. Although,
as  will  be  seen  below,  an  SLS  may  be  composed  of  a  series  of  polygonal 
segments,  the  length  of  an  SLS  feature  is  defined  as  the  Euclidean 
distance  between  its  terminator  points,  its  direction  is  defined  as  the 
direction  of  a  line  passing  through  these  two  points,  and  its  centroid 
is  simply  the  midpoint.  The  direction  attribute  is  encoded  using  the 
approach  indicated  in  the  previous  section. 

Primitives

The primitives of an SLS are polygonal segments satisfying the
following semantic rules:

Semantic Rules

$R_1(6)$ = TRUE if two contiguous polygonal segments have an interior
angle greater than a threshold $T_1(6)$.

$R_2(6)$ = TRUE if a single polygonal segment has length greater than
$T_2(6) \cdot L$, where $L$ is the sum of the lengths of all the
polygonal segments and $T_2(6)$ is a constant less than one.

$R_3(6)$ = TRUE if there is only one polygonal segment.

Rule $R_1(6)$ is applied recursively and any segments for which
$R_2(6) \cup R_3(6)$ = TRUE are classified as SLS's.
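
The recursive merging step of Level 6 (joining contiguous polygonal segments whose interior angle exceeds a threshold) can be sketched as repeated removal of nearly straight vertices. This is one plausible reading of "applied recursively", not a documented procedure. The function names and the parameter `t1_6_deg` (threshold in degrees) are hypothetical, and consecutive vertices are assumed to be distinct.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def interior_angle(a: Point, v: Point, b: Point) -> float:
    """Angle in degrees at vertex v between the segments v-a and v-b."""
    ux, uy = a[0] - v[0], a[1] - v[1]
    wx, wy = b[0] - v[0], b[1] - v[1]
    cos_t = (ux * wx + uy * wy) / (math.hypot(ux, uy) * math.hypot(wx, wy))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def merge_nearly_straight(vertices: List[Point], t1_6_deg: float) -> List[Point]:
    """Repeatedly drop any interior vertex whose interior angle exceeds
    t1_6_deg, merging its two polygonal segments into one."""
    pts = list(vertices)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(pts) - 1):
            if interior_angle(pts[i - 1], pts[i], pts[i + 1]) > t1_6_deg:
                del pts[i]
                changed = True
                break  # restart the scan on the shortened polyline
    return pts


# The first two segments are collinear and merge; the right angle stays.
merged = merge_nearly_straight([(0, 0), (2, 0), (4, 0), (4, 3)], t1_6_deg=170.0)
```

After merging, each remaining pair of consecutive vertices is a candidate straight-line segment whose length and direction follow from its two terminator points.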

2.8 Level Seven: Corners

Corners† have the structural description

Level 7/corner/attributes/primitives/semantic rules
Attributes 

The attributes are: (1) angle, (2) depth, (3) width, (4) area,
(5) direction, (6) length of sides, (7) location of the terminator points,
(8) location of the centroid, (9) convexity, and (10) concavity. The
meaning of these attributes may be explained with the aid of Fig. 4. The
angle of a corner is defined to be the interior angle formed by the two
sides. If we treat the corner as a triangle, as shown in Fig. 4, the
width is defined as the length of the base of the triangle, while the
depth is the length of its altitude. The area is the area of the
triangle. The direction of the corner (quantized as in Fig. 3) is given
by the direction of the altitude segment directed from the corner point
to the base. The centroid of a corner is the average of the centroids
of its sides. To establish whether a corner is convex or concave we
consider a traveler traversing the corner in a clockwise up-down manner.
If the base of the triangle lies to the traveler's right hand, the corner
is convex; otherwise it is concave.
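
The triangle-based corner attributes can be computed directly from the corner point and the far ends of its two sides. The sketch below covers angle, width, depth, area, side lengths, and centroid. Direction quantization (Fig. 3) and the convex/concave test are left out because they depend on figure conventions not reproduced here. The function name and the dictionary keys are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def corner_attributes(end_a: Point, apex: Point, end_b: Point) -> Dict[str, object]:
    """Treat the corner as the triangle (end_a, apex, end_b)."""
    ax, ay = end_a[0] - apex[0], end_a[1] - apex[1]
    bx, by = end_b[0] - apex[0], end_b[1] - apex[1]
    side_a, side_b = math.hypot(ax, ay), math.hypot(bx, by)
    cos_t = (ax * bx + ay * by) / (side_a * side_b)
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    width = math.hypot(end_b[0] - end_a[0], end_b[1] - end_a[1])  # base
    area = 0.5 * abs(ax * by - ay * bx)
    depth = 2.0 * area / width  # altitude from the apex to the base
    # Centroid of the corner: average of the midpoints of its two sides.
    mid_a = ((end_a[0] + apex[0]) / 2.0, (end_a[1] + apex[1]) / 2.0)
    mid_b = ((end_b[0] + apex[0]) / 2.0, (end_b[1] + apex[1]) / 2.0)
    centroid = ((mid_a[0] + mid_b[0]) / 2.0, (mid_a[1] + mid_b[1]) / 2.0)
    return {"angle": angle, "width": width, "depth": depth, "area": area,
            "sides": (side_a, side_b), "centroid": centroid}


# Isosceles corner with sides of length 5 and a base of length 6.
attrs = corner_attributes((0.0, 0.0), (3.0, 4.0), (6.0, 0.0))
```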


†This classification of corners is not related to the measures of
cornericity discussed in Appendix B.




Primitives 

The primitives of corners are SLS's satisfying the following semantic
rules:

Semantic rules

$R_1(7)$ = TRUE if $R_3(6)$ = FALSE.

$R_2(7)$ = TRUE if there remain two or more contiguous SLS's after the
recursive application of rule $R_3(6)$.

If $R_1(7) \cap R_2(7)$ = TRUE we have the condition for at least one corner.
If  there  are  more  than  two  contiguous  SLS's,  they  are  considered  pairwise, 
each  contiguous  pair  forming  a  corner  and  possibly  sharing  sides  with 
other  contiguous  corners. 

2.9  Level  Eight:  Bays 

The  final  features  computed  by  the  hierarchical  feature  extractor 
are  bays,  which  have  the  structural  description 

Level 8/bay/attributes/primitives/semantic rules
Attributes 

The attributes of a bay are: (1) opening, (2) area, (3) direction,
(4) degree, (5) height-to-width ratio (H/W), (6) length, (7) location
of the terminator points, (8) location of the centroid, (9) convexity,
and (10) concavity. The opening attribute is simply the Euclidean distance
between the two terminator points. The area is the sum of the areas
of  the  corners  forming  the  bay  (see  below).  The  direction  of  a  bay  is 
the  average  of  the  directions  of  the  corners.  The  degree  of  a  bay  is 
defined  as  the  number  of  corners  of  which  it  is  composed.  The  attribute 
H/W  is  the  height-to-width  ratio  of  a  bounding  box  (the  orientation  of 
the  box  could  be  along  the  principal  eigen  axes  or  simply  along  the  x-y 
extremes).  The  length  of  a  bay  is  equal  to  the  sum  of  the  lengths  of 
the  sides  of  the  corners.  The  centroid  is  the  average  of  the  corner 
centroids.  Finally,  a  convex  (concave)  bay  is  formed  by  two  or  more 
convex  (concave)  corners. 
Primitives

The primitives of a bay are corners satisfying the following semantic
rule:

Semantic rule

$R_1(8)$ = TRUE if there are two or more convex (concave) contiguous
corners. 

Thus,  the  presence  of  a  bay  is  established  simply  by  combining  corner 
features  which  are  contiguous  and  have  the  same  convexity  or  concavity 
attribute. 
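
Bay formation amounts to grouping runs of contiguous corners that share the same convexity. A minimal sketch follows, assuming the corners arrive as an ordered list of 'x' (convex) or 'v' (concave) codes. The function name and the output tuple layout are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def group_into_bays(convexities: List[str]) -> List[Tuple[str, str, int, int]]:
    """Group contiguous corners with equal convexity.

    Returns (kind, code, first_index, last_index) tuples, where kind is
    "bay" for runs of two or more corners and "corner" otherwise.
    The degree of a bay is last_index - first_index + 1.
    """
    out = []
    start = 0
    for code, run in groupby(convexities):
        n = len(list(run))
        kind = "bay" if n >= 2 else "corner"
        out.append((kind, code, start, start + n - 1))
        start += n
    return out


# Two convex corners followed by a concave one: a bay, then a lone corner.
features = group_into_bays(["x", "x", "v"])
```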

The  material  in  Section  2.2  through  2.9  is  summarized  in  Table  1 
and  illustrated  in  the  following  example. 

Example 

The  concepts  discussed  thus  far  are  illustrated  in  Fig.  5.  The  input 
to  the  hierarchical  feature  extractor  is  shown  in  Fig.  5(a),  and  the 
result of the first level of processing is to attach the labels E1 and E2
to the two end points in the character, as shown in Fig. 5(b). Level 2
produces a null output (there are no branch points), while Level 3 identifies
arc1 and terminator points t1 and t2. It is noted that t1 and t2
are  simply  the  end  points  found  in  Level  1.  Since  there  are  no  lakes  in 
the  character,  Level  4  produces  a  null  output.  Level  5  produces  polygonal 
segments  s^  through  Sg,  as  shown  in  Fig.  5(d).  Note  the  introduction  of 
terminator  points  t3  through  tg  used  to  denote  the  ends  of  the  polygonal 
segments.  Level  6  has  the  output  shown  in  Fig.  5(e).  Polygonal  segments 
s3 and s4 were combined in this case into straight-line segment SLS3 and,
consequently,  terminator  point  tg  is  no  longer  of  interest.  Level  7 
produces corners c1, c2, and c3, along with their terminator points. It
is  noted  that  t^  is  a  terminator  point  for  both  c^  and  c3  and  that  c2  has 
terminator points t-j and tg. Finally, the output of Level 8 is shown in
Fig. 5(g). It combined corners c1 and c2 into bay1 with terminator
points  t^  and  tg.  Thus,  the  highest-level  description  of  the  input 
character consists of bay1 followed by corner3.

It is important to note that all the information computed at a given
level is available to all higher levels. Thus, the descriptors associated
with the location of terminator points, length and direction of line
segments, etc., are implicitly available to further refine the final
features for the purpose of establishing their semantic validity.


2.10  Coding  of  Feature  Strings 

Once individual features have been extracted and their attributes
computed, the next processing step is to organize the features in the
form of a string suitable for the syntactic/semantic recognizer. A set
of m features extracted from a given input character will be represented
by the string notation

$$\alpha = f_1 f_2 \cdots f_m$$

where each $f_i$ represents any of the features obtained by the procedures
discussed in Sections 2.2 through 2.9.

As  discussed  in  Section  3,  the  basic  structure  of  a  character  will 
be  inherent  in  its  string  representation,  while  semantic  rules  will  be 
used  for  quantitatively  refining  the  information  available  in  a  given 
string.  In  order  to  simplify  the  notation,  the  following  codes  will  be 
used  to  denote  the  features  discussed  in  the  preceding  sections. 

p: branch point
a: lake
s: straight-line segment
c: corner
b: bay
x: convex
v: concave


It  is  noted  that  the  last  two  codes  refer  to  attributes  which  apply 
to  corners  and  bays.  Although  these  two  attributes  could  be  incorporated 
in  the  semantic  rules,  they  are  included  in  the  string  representation 
as  a  rugged,  overall  descriptor  to  differentiate,  for  example,  between 
5's  and  S's.  The  refinement  of  a  given  corner  or  bay  (i.e.,  direction, 
length,  etc.)  will  be  handled  via  the  semantics.  As  indicated  in 
Section  3,  the  degree  of  information  incorporated  in  a  string  vs.  a 
semantic specification is an arbitrary trade-off. The approach taken here
is to include in a string features and descriptors which can aid in making
gross decisions between characters early in the recognition stage. It
is  also  noted  that  end  points,  arcs,  and  polygonal  segments  are  not 
included  in  the  feature  codes.  The  reason  for  this  is  that  they  are 
either  implicit  in,  or  refined  into,  one  or  more  of  the  features  coded 
above  prior  to  recognition. 

In  order  to  reduce  the  complexity  of  the  recognition  stage,  it  is 
advantageous  to  organize  all  coded  strings  in  a  systematic  manner.  One 
way  to  accomplish  this  is  to  group  all  features  in  order  of  increasing 
complexity  in  the  feature  hierarchy  discussed  in  Sections  2.2  through  2.9. 
For example, a character composed of a bay, a lake, a branch point, and
a straight-line segment would be coded as the string $\alpha$ = pasb. The
descriptors x and v precede the feature they modify. If the bay in this
example were convex, the complete string representation would then be
$\alpha$ = pasxb. Multiple features of the same class are grouped together. If,
for instance, there were two branch points the string would be $\alpha$ = ppasxb.
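
The coding convention of this section can be sketched as a stable sort by hierarchy level followed by concatenation, with the x or v descriptor placed immediately before the corner or bay it modifies. The level table follows Sections 2.2 through 2.9. The function name and the input layout (a list of code/descriptor pairs) are assumptions.

```python
from typing import List, Optional, Tuple

# Hierarchy level of each coded feature (Sections 2.3 through 2.9).
LEVEL = {"p": 2, "a": 4, "s": 6, "c": 7, "b": 8}


def encode(features: List[Tuple[str, Optional[str]]]) -> str:
    """Build the feature string from (code, descriptor) pairs, where the
    descriptor is 'x', 'v', or None. Features are grouped in order of
    increasing complexity; sorted() is stable, so equal codes keep their
    input order."""
    ordered = sorted(features, key=lambda f: LEVEL[f[0]])
    return "".join((descriptor or "") + code for code, descriptor in ordered)


# A convex bay, a lake, a branch point, and a straight-line segment.
string = encode([("b", "x"), ("a", None), ("p", None), ("s", None)])
```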
III. SYNTACTIC/SEMANTIC RECOGNITION

3.1  Background 

The  existence  of  recognizable,  finitely  describable  structure  in 
a  pattern  is  essential  for  success  in  the  syntactic  approach.  Basically, 
a  formal  grammar  is  developed  to  generate  elements  of  a  language  that 
defines  a  pattern  class,  and  an  automaton  or  a  parsing  algorithm  is  used 
to  recognize  precisely  that  language  [2,3]. 

For example, the grammar $G_1 = (N, \Sigma, P, S)$ with nonterminals
$N = \{S, B, C\}$, terminals $\Sigma = \{a, b, c\}$, productions
$P = \{S \to aSBC,\ S \to aBC,\ CB \to BC,\ aB \to ab,\ bB \to bb,\ bC \to bc,\ cC \to cc\}$,
and starting symbol $S$, can be shown to generate the language

$$L(G_1) = \{\, y \mid y = a^n b^n c^n,\ n \geq 1 \,\}.$$

If the terminals a, b, c have an interpretation as pattern primitives which
are unit-length directed line segments

a: /    b: →    c: \

then $L(G_1)$ defines a class of equilateral triangles via a trace of the
boundary of a triangle, as shown in Fig. 6. The nature of the productions
makes grammar $G_1$ a context-sensitive grammar.

Many,  if  not  all,  practical  pattern  recognition  systems  that  use 
structural  models  are  in  fact  hybrid  systems  [4];  that  is,  they  are  combi¬ 
nations  of  structural  modeling  techniques  (primarily  using  syntactic  models) 
and  classical  decision-theoretic  techniques.  One  frequently  finds  that 
decision-theoretic  methods  are  used  to  identify  and  extract  the  primitives 
in  a  given  pattern,  then  syntactic  methods  are  used  to  attempt  the  final 
classification  by  an  analysis  of  the  relationships  among  the  primitives. 

The productions in a grammar like $G_1$ above are purely syntax rules.
They define implicitly the form that strings of terminals must have in
order to belong to the language $L(G_1)$. In the example given above, that
form is $a^n b^n c^n$ or, in words, "at least one a followed by the same number
of b's followed by the same number of c's." But the productions in the
grammar $G_1$ do not deal in any way with values (numerical, vector, logical,
or otherwise) that terminals a, b, c might take on.

Figure 6. Patterns and their representation as strings.

The  assignment  of  quantitative  information  to  features  in  a  syntactic 
pattern  recognition  formulation  is  accomplished  via  the  use  of  attributed 
grammars [4-8]. The term "attributed" in this context means that we will
employ a conventional syntactic grammar, but will now allow the terminals
and nonterminals to have associated attributes which are assigned values
by some predefined mechanism. For example, the syntactic description of
certain types of 2's is given by the string $\alpha$ = xbvc (i.e., a convex bay
followed  by  a  concave  corner).  As  discussed  in  Section  2,  each  feature 
has  a  given  set  of  attributes,  such  as  direction  and  length,  which  are 
quantifiers  of  that  feature.  Thus,  the  use  of  syntax  establishes  a  given 
basic  structure,  while  the  attributes  attach  quantitative  descriptions 
to  the  features  forming  that  structure.  The  rules  for  assigning  meaning 
to  the  resulting  attributed  structure  are  semantic  rules  analogous  in  form 
to  the  semantic  rules  described  previously  in  connection  with  hierarchical 
feature  extraction.  In  the  case  of  recognition,  however,  the  semantic 
rules  are  used  for  assigning  meaning  (i.e.,  valid  vs.  invalid  character) 
to  the  overall  structure.  Thus,  we  see  that  the  attributed-grammar 
approach  to  recognition  involves  three  principal  elements:  (1)  the 
specification  of  a  conventional  syntactic  grammar,  (2)  the  use  of 
attributes  for  quantifying  the  features,  and  (3)  the  specification  of  a 
set  of  semantic  rules  for  assigning  meaning  to  an  attributed  structure. 

There  are  two  major  reasons  for  considering  attributed  grammars  in 
structural  recognition.  First,  in  most  problems,  it  makes  more  sense  to 
identify  features  as  symbols  together  with  their  attribute  values  than 
it  does  to  try  to  package  all  information  about  the  primitives  into  a 
much  larger  set  of  symbols  without  attributes.  Second,  it  is  well  known 
that  the  use  of  attributes  and  associated  semantic  rules  can  dramatically 
reduce  the  complexity  of  the  syntactic  analysis  of  certain  classes  of 
patterns [9].

As an illustration, we reconsider the context-sensitive language
$\{\, y \mid y = a^n b^n c^n,\ n \geq 1 \,\}$ generated by the grammar given earlier. The
inclusion of some very simple semantic rules allows the use of a much
simpler regular grammar, as follows. The regular grammar $G_2 = (N, \Sigma, P, S)$
with nonterminal $N = \{S\}$, terminals $\Sigma = \{a, b, c\}$, productions
$P = \{S \to aS,\ S \to bS,\ S \to cS,\ S \to c\}$ numbered for reference (#1 through #4,
in the order listed), and starting symbol $S$, generates a language $L(G_2)$
that properly contains $L(G_1)$; thus, all strings in $L(G_1)$ are
syntactically correct for $L(G_2)$ as well, but there are additional strings
in $L(G_2)$ that must be rejected. The semantic rules must be developed to
disallow all derivation trees for strings in $L(G_2) - L(G_1)$, that is, all
strings not of the form $a^n b^n c^n$, $n \geq 1$. In words, these rules require:

(1) all uses of production #m before production #m + 1 for $1 \leq m \leq 3$;
and

(2) the same number of uses of production #1 as of production #2 as
of productions #3 + #4.

This  example  illustrates  the  fact  that  a  simple  regular  grammar, 
along  with  some  simple  semantic  rules,  can  be  made  to  behave  as  a  much 
more  powerful  context-sensitive  grammar.  As  will  be  seen  in  Section  4, 
this  is  an  important  point  because  learning  algorithms  for  regular 
grammars  are  relatively  easy  to  formulate. 
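
The interplay between the regular grammar and its two semantic rules can be made concrete with a short sketch. It assumes the four productions are numbered in the order S → aS, S → bS, S → cS, S → c, recovers the (unique) production sequence for a string, and then applies the ordering and counting rules. The function names are hypothetical.

```python
from typing import List, Optional


def production_sequence(s: str) -> Optional[List[int]]:
    """Production numbers the regular grammar uses to derive s, or None
    if s is not in its language (any string over {a, b, c} ending in c)."""
    if not s or s[-1] != "c" or any(ch not in "abc" for ch in s):
        return None
    number = {"a": 1, "b": 2, "c": 3}
    # Every symbol but the last is produced by #1, #2, or #3; the final
    # c is produced by the terminating production #4.
    return [number[ch] for ch in s[:-1]] + [4]


def accepts_with_semantics(s: str) -> bool:
    """Syntax check followed by the two semantic rules."""
    seq = production_sequence(s)
    if seq is None:
        return False
    # Rule (1): all uses of production #m come before production #m + 1.
    ordered = all(p <= q for p, q in zip(seq, seq[1:]))
    # Rule (2): uses of #1 == uses of #2 == uses of #3 plus #4.
    balanced = seq.count(1) == seq.count(2) == seq.count(3) + seq.count(4)
    return ordered and balanced


results = {s: accepts_with_semantics(s) for s in ("abc", "aabbcc", "abcc", "bac", "c")}
```

Strings such as "abcc" and "bac" are syntactically valid for the regular grammar but are rejected by the semantic rules, which is the behavior described above.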

3.2  Specification  of  Semantic  Rules 

Although,  as  explained  in  Section  4,  it  is  possible  to  learn  regular 
grammars  by  utilizing  training  samples  in  a  grammatical  inference  algorithm, 
no  automatic  procedures  exist  for  specifying  semantic  rules.  This,  however, 
is  not  a  particularly  serious  limitation  if  the  semantic  rules  are 
specified  interactively  using  an  approach  such  as  the  one  suggested  in 
Section  6. 

The  semantic  rules  for  recognition  are  similar  in  nature  to  those 
already  discussed  for  the  hierarchical  feature  extractor,  with  the 
exception  that  they  apply  in  general  not  only  to  single  features,  but  also 
to  combinations  of  features,  as  well  as  the  production  rules  in  a  given 
grammar.  These  concepts  are  illustrated  in  the  following  section. 

3.3  Example  of  Syntactic/Semantic  Specification 

The material discussed in the previous two sections is illustrated
in this section by a syntactic/semantic specification for open-top 0's.
With reference to the feature codes discussed in Section 2.10, a grammar†
for recognizing a large class of this type of character is given by

$G = (N, \Sigma, P, S)$

where

$N = \{S, A, B\}$

$\Sigma = \{b, c, x, v\}$

and the productions in P are

(1)  $S \to xb$
(2)  $S \to xbA$
(3)  $S \to xcA$
(4)  $S \to vcS$
(5)  $A \to vc$
(6)  $A \to vcB$
(7)  $B \to xb$
(8)  $B \to xc$
(9)  $B \to xbS$
(10) $B \to xcS$

The semantic rules†† for this grammar are

$r_1$ = TRUE if the average direction of all convex bays and
corners is 1 or 2 (see Fig. 3).

$r_2$ = TRUE if the area of each concavity is less than
$T_1$ * (average convexity area).

$r_3$ = TRUE if, when production (1) is applied, the degree of the
bay is 3 or greater.

$r_4$ = TRUE if (i) production (4) is not repeated consecutively,
and (ii) production (5) does not follow production (3).

$r_5$ = TRUE if OPENING, defined as the Euclidean distance between
the terminator points, is less than $T_2$ * (overall length).

$r_6$ = TRUE if $H/W > T_3$,

where $T_1$, $T_2$, and $T_3$ are constant threshold values. A given character
is recognized as an open-top zero if its string representation is
syntactically correct (determined by the syntactic recognizer, as
discussed later) and $r_i$ = TRUE for every $i = 1, 2, \ldots, 6$. In other
words, the input must be both syntactically and semantically correct to be
accepted.

†A grammar can be obtained in one of two ways: (1) heuristically by
studying characters of interest, or (2) by formal grammatical inference
techniques. The grammar given in this section was obtained by the former
approach. Grammatical inference is discussed in Section 4.

††Semantic rules for recognition are denoted by lower case r's in order
to differentiate them from semantic rules for the hierarchical feature
extractor.

The  structure  (syntax)  of  acceptable  characters  is  established  by 
the  production  rules.  For  example,  the  first  production  yields  a  convex 
bay,  while  the  use  of  productions  (2),  (6),  and  (7)  yields  a  convex  bay, 
followed  by  a  concave  corner,  followed  by  a  convex  bay. 

Semantic rule $r_1$ establishes the acceptable direction of the overall
convexity to be between 45° and 135°. (The average direction of all
convexities is a rugged representation of the direction of the opening.)
Semantic rule $r_2$ precludes large concavities and thus gives a low-level
guarantee of regularity. Rule $r_3$ establishes a minimum regularity of the
boundary. For example, a bay of degree 3 closely resembles a triangle,
which is deemed unacceptable for a zero. Rule $r_4$ similarly excludes
ill-formed characters. Rule $r_5$ excludes large openings ("large" being
measured as a fraction of overall length in order to make this rule
insensitive to size). Finally, rule $r_6$ precludes zeros which are short
and fat beyond a given threshold. Figure 7 shows some acceptable characters
and Fig. 8 shows some characters which would be rejected as being either
syntactically or semantically incorrect.
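
The attribute-based rules $r_2$, $r_5$, and $r_6$ lend themselves to a short
sketch. The Measurements record, its field names, and the numeric values of
T1, T2, and T3 below are all assumptions made for illustration; the report
does not specify a data layout or threshold values. Rules $r_1$, $r_3$, and
$r_4$ are omitted because they depend on the direction codes of Fig. 3 and
on the production history.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Measurements:
    """Hypothetical attribute record for one thinned character."""
    concavity_areas: List[float]
    convexity_areas: List[float]   # assumed non-empty (every string has a convexity)
    terminators: Tuple[Point, Point]
    overall_length: float
    height: float
    width: float


# Placeholder thresholds; in practice they would be chosen from training data.
T1, T2, T3 = 0.25, 0.30, 0.80


def r2(m: Measurements) -> bool:
    """Each concavity area is below T1 times the average convexity area."""
    average = sum(m.convexity_areas) / len(m.convexity_areas)
    return all(area < T1 * average for area in m.concavity_areas)


def r5(m: Measurements) -> bool:
    """The opening (distance between terminators) is below T2 times the length."""
    (x0, y0), (x1, y1) = m.terminators
    return hypot(x1 - x0, y1 - y0) < T2 * m.overall_length


def r6(m: Measurements) -> bool:
    """The height-to-width ratio exceeds T3 (rejects short, fat shapes)."""
    return m.height / m.width > T3
```

Because $r_5$ and $r_6$ are expressed as ratios, scaling a character
uniformly leaves both results unchanged, which matches the size
insensitivity noted in the text.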

3.4 Recognizer

Grammars  were  shown  in  the  previous  three  sections  to  be  generators 
of  string  sets.  In  this  section,  we  consider  the  problem  of  recognizing 
a  given  syntactic/semantic  string  representation. 

In terms of overall system specification (representation, learning,
and recognition), the most practical approach is to employ regular grammars
because of the availability of learning algorithms and the simplicity
of formulation for the recognizer. As indicated in Section 3.1, the
utilization of semantic rules allows considerable expansion of the
pattern-representation power of regular grammars.


Figure  8.  Examples  of  unacceptable  characters.  The  patterns 
shown  in  (a) ,  (b)  and  (c)  are  all  syntactically  correct,  but 
violate  semantic  rules  r5/  r.,  and  r~,  respectively.  The 
pattern  shown  in  (d)  is  Syntactically  incorrect. 


The formal recognizer for regular languages is the finite automaton,
defined as a five-tuple $A = (Q, \Sigma, \delta, q_0, F)$, where

$Q$ is a finite set of states,

$\Sigma$ is a finite input alphabet,

$\delta$ is a mapping from $Q \times \Sigma$ into $2^Q$, the collection of all
subsets of $Q$,

$q_0$ in $Q$ is the starting state, and

$F$, a subset of $Q$, is a set of final or accepting states.

We say that a given string is recognized by A if, starting in state $q_0$,
the automaton is capable of scanning the entire string and it is in one
of the states of F after the last symbol in the string has been processed.

As an illustration of this notation, consider the automaton
$A = (Q, \Sigma, \delta, q_0, F)$ with

$Q = \{q_0, q_1\}$,  $\Sigma = \{a, b\}$,  $F = \{q_1\}$

and mappings

$\delta(q_0, a) = \{q_0\}$

$\delta(q_0, b) = \{q_1\}$

$\delta(q_1, a) = \delta(q_1, b) = \emptyset$

where $\emptyset$ is the null set (undefined states in this case).

A finite automaton is conveniently represented by its state transition
diagram, a directed graph whose nodes, corresponding to states, are
connected by arcs that are labeled with the input symbols which cause
transitions. By convention, all final states are denoted by double circles
and the starting state is designated by an entering arrow.

The state transition diagram for the example just given is shown
in Fig. 9. It is noted that this automaton remains in state $q_0$ for any
number of input a's, and makes a transition to the final state when a
symbol b occurs in the string. Thus, the language accepted by the automaton
consists of the set of strings of the form $\{a^n b\}$, $n \ge 0$. All other
strings are rejected. For example, input abb is not accepted because
$\delta(q_0, a) = \{q_0\}$, $\delta(q_0, b) = \{q_1\}$, but
$\delta(q_1, b) = \emptyset$, causing A to halt because
it is unable to complete processing the entire string.
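
The acceptance criterion just described can be sketched as a small
set-based simulation. The function name nfa_accepts and the dictionary
encoding of $\delta$ are assumptions for illustration; missing dictionary
entries play the role of the null set.

```python
from typing import Dict, Set, Tuple

Delta = Dict[Tuple[str, str], Set[str]]


def nfa_accepts(delta: Delta, start: str, finals: Set[str], string: str) -> bool:
    """Track every state the automaton could be in after each input symbol."""
    current = {start}
    for symbol in string:
        # Union of delta(q, symbol) over all current states; absent entries are empty.
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
        if not current:
            return False  # the automaton halts before the string is consumed
    return bool(current & finals)


# The example automaton: delta(q0, a) = {q0}, delta(q0, b) = {q1}, F = {q1}.
EXAMPLE_DELTA: Delta = {("q0", "a"): {"q0"}, ("q0", "b"): {"q1"}}
EXAMPLE_FINALS = {"q1"}
```

With this encoding, nfa_accepts(EXAMPLE_DELTA, "q0", EXAMPLE_FINALS, "aab")
holds, while the string "abb" is rejected at its last symbol.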
In order to incorporate the semantic specifications discussed in the
previous section into the recognition procedure, it is important to
establish the relationship between a grammar and its corresponding
automaton. This relationship is based on a fundamental theorem from formal
language theory which states that a language is recognized by a finite
automaton iff it is generated by a regular grammar [3].

Given a regular grammar $G = (N, \Sigma, P, X_0)$, where $X_0$ is the starting
symbol, the corresponding finite automaton $A = (Q, \Sigma, \delta, q_0, F)$ is
specified as follows.† Suppose the nonterminal set N is composed of the
starting symbol $X_0$ and n additional nonterminals $X_1, X_2, \ldots, X_n$.
Then, the state set Q of A is formed by n + 2 states
$\{q_0, q_1, \ldots, q_n, q_{n+1}\}$ such that $q_i$ corresponds to $X_i$ for
$0 \le i \le n$, and $q_{n+1}$ is an additional state such that
$F = \{q_{n+1}\}$. The set of input symbols of A is the same as $\Sigma$ in G,
and the $\delta$ mapping is defined by two rules based on the productions of
G, as follows: For $0 \le i \le n$, $0 \le j \le n$, and any $a$ in $\Sigma$,

1) if $X_i \to aX_j$ is in P, then $\delta(q_i, a)$ contains $q_j$, and

2) if $X_i \to a$ is in P, then $\delta(q_i, a)$ contains $q_{n+1}$.

As an illustration of this procedure, consider the grammar for open-top
0's given in Section 3.3. This grammar is easily converted to
regular form by defining the following symbols:

$b_1 = xb$
$b_2 = xc$
$b_3 = vc$

Using these symbols and the above notation for the nonterminals (i.e.,
$S = X_0$, $A = X_1$, and $B = X_2$), the grammar becomes

$G = (N, \Sigma, P, X_0)$

where

$N = \{X_0, X_1, X_2\}$

$\Sigma = \{b_1, b_2, b_3\}$

and the productions in P are

(1) $X_0 \to b_1$        (6)  $X_1 \to b_3 X_2$

(2) $X_0 \to b_1 X_1$    (7)  $X_2 \to b_1$

(3) $X_0 \to b_2 X_1$    (8)  $X_2 \to b_2$

(4) $X_0 \to b_3 X_0$    (9)  $X_2 \to b_1 X_0$

(5) $X_1 \to b_3$        (10) $X_2 \to b_2 X_0$

†A similar procedure exists for obtaining the regular grammar corresponding
to a given finite automaton [3].

Based on the preceding discussion, the automaton
$A = (Q, \Sigma, \delta, q_0, F)$ corresponding to this grammar has the state
set

$Q = \{q_0, q_1, q_2, q_3\}$

symbol set

$\Sigma = \{b_1, b_2, b_3\}$

and final state set

$F = \{q_3\}$

The mappings are obtained by applying the two rules given above to the
productions of G. For example, production (5) is of the form shown in
rule 2, so that $\delta(q_1, b_3) = \{q_3\}$, while production (6) is of the
form shown in rule 1, which gives $\delta(q_1, b_3) = \{q_2\}$. Thus, the
combined mapping is $\delta(q_1, b_3) = \{q_2, q_3\}$.† Following this
procedure yields the following mappings corresponding to the ten
productions of G:

$\delta(q_0, b_1) = \{q_1, q_3\}$

$\delta(q_0, b_2) = \{q_1\}$

$\delta(q_0, b_3) = \{q_0\}$

$\delta(q_1, b_3) = \{q_2, q_3\}$

$\delta(q_2, b_1) = \{q_0, q_3\}$

$\delta(q_2, b_2) = \{q_0, q_3\}$

†The fact that there are two possible transitions for this state with the
same input symbol indicates that this is a non-deterministic automaton.
As shown in [3], such an automaton is easily made deterministic by
introducing additional states.
All other transitions, for example $\delta(q_1, b_1)$, yield the null set,
indicating an unacceptable input string. The state transition diagram is
shown in Fig. 10.
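
The two conversion rules can be sketched as follows. This is an
illustrative sketch only: the triple encoding of productions and the name
grammar_to_nfa are assumptions, with nonterminal $X_i$ and state $q_i$ both
represented by the integer i.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

# (i, a, j) encodes X_i -> a X_j; (i, a, None) encodes X_i -> a.
Production = Tuple[int, str, Optional[int]]


def grammar_to_nfa(n: int, productions: List[Production]):
    """Build delta for a regular grammar with nonterminals X_0 .. X_n."""
    final = n + 1  # the additional accepting state q_{n+1}
    delta: Dict[Tuple[int, str], Set[int]] = defaultdict(set)
    for i, a, j in productions:
        # Rule 1 adds q_j; rule 2 adds the final state.
        delta[(i, a)].add(final if j is None else j)
    return dict(delta), final


# The ten productions of the open-top zero grammar in regular form.
OPEN_TOP_ZERO: List[Production] = [
    (0, "b1", None), (0, "b1", 1), (0, "b2", 1), (0, "b3", 0),
    (1, "b3", None), (1, "b3", 2),
    (2, "b1", None), (2, "b2", None), (2, "b1", 0), (2, "b2", 0),
]

delta, q_final = grammar_to_nfa(2, OPEN_TOP_ZERO)
```

Here delta[(1, "b3")] collects both $q_2$ and $q_3$, reproducing the
non-deterministic transition discussed above; pairs absent from the
dictionary correspond to the null set.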

The automaton just obtained is a recognizer based on the structure
(syntax) of strings corresponding to open-top 0's. In order to incorporate
the semantic rules discussed in Section 3.3, we first determine if a
given string is syntactically correct (i.e., if it is accepted by the
automaton). If it is, the semantic rules are tested using the procedure
discussed in Section 3.3. The only exception is that, instead of
productions, we use the corresponding mappings in the automaton. For
example, semantic rule $r_4$ would now read: $r_4$ = TRUE if (i) mapping
$\delta(q_0, b_3)$ is not repeated consecutively, and (ii) mapping
$\delta(q_1, b_3) = \{q_3\}$† does not follow mapping $\delta(q_0, b_2)$. This
type of information can be easily incorporated into the recognition process
in the form of a history of automaton transitions as a string is processed.

†Note that since the automaton is non-deterministic the transition actually
corresponding to production (5) must be used in testing this semantic rule.
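
One way to keep such a history is to label each transition with the
production it came from and to test $r_4$ on the resulting sequence. The
sketch below assumes this labeling; the names TRANSITIONS, histories, r4,
and accepts_with_r4 are hypothetical, and a simple recursive search stands
in for whatever bookkeeping an actual implementation would use.

```python
from typing import Iterator, Sequence, Tuple

# (production number, source state, input symbol, destination state);
# state 3 is the final state q3 of the open-top zero automaton.
TRANSITIONS = [
    (1, 0, "b1", 3), (2, 0, "b1", 1), (3, 0, "b2", 1), (4, 0, "b3", 0),
    (5, 1, "b3", 3), (6, 1, "b3", 2), (7, 2, "b1", 3), (8, 2, "b2", 3),
    (9, 2, "b1", 0), (10, 2, "b2", 0),
]
FINAL = 3


def histories(symbols: Sequence[str], state: int = 0,
              history: Tuple[int, ...] = ()) -> Iterator[Tuple[int, ...]]:
    """Yield each production history that consumes all symbols and ends in q3."""
    if not symbols:
        if state == FINAL:
            yield history
        return
    for prod, src, sym, dst in TRANSITIONS:
        if src == state and sym == symbols[0]:
            yield from histories(symbols[1:], dst, history + (prod,))


def r4(history: Tuple[int, ...]) -> bool:
    """(i) no consecutive use of (4); (ii) (5) never directly follows (3)."""
    pairs = list(zip(history, history[1:]))
    return (4, 4) not in pairs and (3, 5) not in pairs


def accepts_with_r4(symbols: Sequence[str]) -> bool:
    """Syntactically accepted by the automaton and consistent with rule r4."""
    return any(r4(h) for h in histories(symbols))
```

For the input b2 b3 the only accepting history is (3, 5), so the string is
syntactically correct but fails $r_4$; the input b1 b3, with history (2, 5),
passes.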
IV. LEARNING


4.1  Background 

As  indicated  in  the  previous  section,  one  of  the  principal  advantages 
of  using  syntactic/semantic  formulations  in  pattern  recognition  is  that 
it  often  allows  utilization  of  grammars  which  are  considerably  simpler 
than  those  that  would  be  required  if  only  syntax  were  employed.  In 
particular,  the  use  of  regular  grammars  enjoys  the  distinct  advantage  of 
having  the  best-developed  learning  algorithms.  The  algorithm  presented 
in  this  section  relies  only  on  one  input  parameter,  and  its  behavior  as 
a  function  of  this  parameter  is  fully  understood.  As  will  be  shown  in  the 
following  discussion,  this  procedure  learns  the  structure  of  a  finite 
automaton  directly  from  a  sample  set  of  training  patterns  expressed  in  the 
form  of  strings. 

4.2  Learning  Algorithm 

Let R be a set of pattern strings (including the empty string) and
let z be a string in $\Sigma^*$ such that zw is in R for some w in
$\Sigma^*$.† Given a non-negative integer k, the k-tail of z with respect to R
is defined as the set $\{w \mid zw \text{ in } R,\ |w| \le k\}$. In other
words, the k-tail of a string z is a set consisting of all the strings w
subject to the conditions that, for any particular w, the string zw is in R
and the length of w is less than or equal to k. For notational convenience,
we denote the k-tail set as

$h(z, R, k) = \{w \mid zw \text{ in } R,\ |w| \le k\}$

This notation clearly shows the functional dependence of the k-tail on z,
R, and k.
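
The k-tail definition translates almost directly into code. In the sketch
below the function name k_tail is an assumption, and the empty Python
string "" stands for the empty string of the text.

```python
from typing import Iterable, Set


def k_tail(z: str, R: Iterable[str], k: int) -> Set[str]:
    """Return h(z, R, k): every w with zw in R and |w| <= k."""
    return {s[len(z):] for s in R if s.startswith(z) and len(s) - len(z) <= k}
```

Any z that is not a prefix of some string in R yields the null set, since
no w can complete it to a member of R.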

Using the k-tail definition, an automaton corresponding to R and a
given value of k is obtained by means of the following procedure [3]:

(1) $\Sigma$ is formed from all the different symbols in R.

(2) The initial state is given by

$q_0 = h(\lambda, R, k)$

where $\lambda$ is the empty string.

(3) The set of states is given by

$Q = \{q \mid q = h(z, R, k) \text{ for } z \text{ in } \Sigma^*\}$.

(4) The final state set is given by

$F = \{q \mid q \text{ in } Q,\ \lambda \text{ in } q\}$.

(5) The mappings from a state q with an input symbol "a" are
given by

$\delta(q, a) = \{q' \mid q' \text{ in } Q,\ q' = h(za, R, k),\ q = h(z, R, k)\}$

†$\Sigma^*$ is the notation used to represent the set of all strings formed
from symbols of $\Sigma$, including the empty string.

This procedure for obtaining an automaton is best explained by an
example. Suppose that $R = \{a, ab, abb\}$ and we let k = 1. From Step 1, we
have $\Sigma = \{a, b\}$. Step 2 gives the starting state as
$q_0 = h(\lambda, R, 1)$. From the above definition of h,

$h(\lambda, R, 1) = \{w \mid \lambda w \text{ in } R,\ |w| \le 1\}$

$= \{a\}$

$= q_0$

In other words, since $|\lambda| = 0$, the only string $\lambda w$ that is in
R, and has length less than or equal to 1, is a. Thus, the set $\{a\}$ is
defined as corresponding to state $q_0$.

The set Q of states is obtained from Step 3 by changing z. As shown
above, when $z = \lambda$ we have

$h(\lambda, R, 1) = \{a\} = q_0$

Next, we select z = a and obtain

$h(a, R, 1) = \{w \mid aw \text{ in } R,\ |w| \le 1\}$

$= \{\lambda, b\}$

$= q_1$

In this case the only strings w that can be appended to z = a with zw
being in R and $|w| \le 1$ are $\lambda$ and b. We define this new set as
state $q_1$. Next we consider the string z = ab and obtain

$h(ab, R, 1) = \{w \mid abw \text{ in } R,\ |w| \le 1\}$

$= \{\lambda, b\}$

$= q_1$

which does not produce a new state. The next string is z = abb. This
yields

$h(abb, R, 1) = \{\lambda\}$

$= q_2$

Other strings z in $\Sigma^*$ will yield, in this case, strings zw that are
not in R, giving rise to a fourth state, denoted by $q_\emptyset$, which
corresponds to the condition that h is the null set. Therefore, we have the
state set

$Q = \{q_0, q_1, q_2, q_\emptyset\}$

According to Step 4, the final state set is given by

$F = \{q \mid q \text{ in } Q,\ \lambda \text{ in } q\}$

Note that, since both $q_1$ and $q_2$ contain $\lambda$, these two states are
in the final state set.

Finally, the mappings are obtained using Step 5. Starting with $q_0$,
we obtain

$\delta(q_0, a) = \{q' \mid q' \text{ in } Q,\ q' = h(\lambda a, R, 1),\ q_0 = h(\lambda, R, 1)\}$

Since $q' = h(\lambda a, R, 1) = h(a, R, 1) = q_1$, we have

$\delta(q_0, a) = q_1$

Similarly,

$\delta(q_0, b) = \{q' \mid q' \text{ in } Q,\ q' = h(\lambda b, R, 1),\ q_0 = h(\lambda, R, 1)\}$

$= q_\emptyset$

The second step follows from the fact that $q' = h(b, R, 1) = q_\emptyset$.

Next we consider transitions from $q_1$, which has two representations
$h(a, R, 1)$ and $h(ab, R, 1)$. Using the first representation yields

$\delta(q_1, a) = \{q' \mid q' \text{ in } Q,\ q' = h(aa, R, 1),\ q_1 = h(a, R, 1)\}$

$= q_\emptyset$

Using the second representation we obtain

$\delta(q_1, a) = \{q' \mid q' \text{ in } Q,\ q' = h(aba, R, 1),\ q_1 = h(ab, R, 1)\}$

$= q_\emptyset$

The transitions from $q_1$ with an input of b are similarly given by

$\delta(q_1, b) = \{q' \mid q' \text{ in } Q,\ q' = h(ab, R, 1),\ q_1 = h(a, R, 1)\}$

$= q_1$

and

$\delta(q_1, b) = \{q' \mid q' \text{ in } Q,\ q' = h(abb, R, 1),\ q_1 = h(ab, R, 1)\}$

$= q_2$

Following this procedure for $q_2$ and $q_\emptyset$ yields the following
mappings:

$\delta(q_2, a) = q_\emptyset$

$\delta(q_2, b) = q_\emptyset$

$\delta(q_\emptyset, a) = q_\emptyset$

$\delta(q_\emptyset, b) = q_\emptyset$

The automaton just obtained from the strings in R is shown in Fig. 11.
It is of interest to note that the automaton recognizes strings of the form
$ab^n$, $n \ge 0$. This is a reasonable generalization of the structure
present in the strings of the learning set $R = \{a, ab, abb\}$.
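
The five-step procedure can be sketched as follows. The function names and
the representation of each state as a frozenset of tails are assumptions
for illustration. Since every z that is not a prefix of a string in R gives
the null set, the sketch enumerates only the prefixes of R and their
one-symbol extensions, which is enough to generate every state and mapping
of the definition.

```python
from itertools import chain
from typing import Dict, FrozenSet, Set, Tuple

State = FrozenSet[str]


def k_tail(z: str, R: Set[str], k: int) -> State:
    """h(z, R, k) as a hashable set; "" plays the role of the empty string."""
    return frozenset(s[len(z):] for s in R if s.startswith(z) and len(s) - len(z) <= k)


def infer_automaton(R: Set[str], k: int):
    """Return (sigma, states, start, finals, delta) following Steps 1-5."""
    sigma = sorted(set(chain.from_iterable(R)))               # Step 1
    prefixes = {s[:i] for s in R for i in range(len(s) + 1)}
    # One-symbol extensions stand in for all z that are not prefixes of R.
    candidates = prefixes | {z + a for z in prefixes for a in sigma}
    start = k_tail("", R, k)                                  # Step 2
    delta: Dict[Tuple[State, str], Set[State]] = {}
    for z in candidates:                                      # Steps 3 and 5
        for a in sigma:
            delta.setdefault((k_tail(z, R, k), a), set()).add(k_tail(z + a, R, k))
    states = {q for q, _ in delta}
    finals = {q for q in states if "" in q}                   # Step 4
    return sigma, states, start, finals, delta


def accepts(automaton, string: str) -> bool:
    """Non-deterministic acceptance test for an inferred automaton."""
    _, _, start, finals, delta = automaton
    current = {start}
    for symbol in string:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & finals)
```

For R = {a, ab, abb} and k = 1 this sketch produces four states, with the
null-set state represented by the empty frozenset, and it accepts strings
such as "abbbb" that are not in R.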

4.3  Properties  of  the  Inferred  Automaton 

Given a specific string set for learning, the procedure discussed in
the previous section has a very important characteristic: it depends only
on the parameter k. Furthermore, the behavior of the learning method is
known to have some useful properties as a function of k. Letting
$L[A(R,k)]$ represent the language accepted by the inferred automaton, A,
for a specific R and k, these properties may be stated as follows [3]:

(1) For any $k \ge 0$, R is a subset of $L[A(R,k)]$.

(2) If $k \ge m$, the length of the longest string in R, then
$L[A(R,k)] = R$.

(3) If k = 0, then $L[A(R,0)] = \Sigma^*$.

(4) $L[A(R,k+1)]$ is a subset of $L[A(R,k)]$.

The first property guarantees that, as a minimum, A will accept all
the strings in R. Property 2 guarantees that A will accept only R if
we set $k \ge m$. Property 3 states that k = 0 gives a useless result: an
automaton that accepts all strings composed of symbols from $\Sigma$. Finally,
Property 4 simply states that increasing k increases the selectivity of
the recognizer. From these properties, it is easily seen that k must be
in the range $0 < k \le m$.
V. CHECKING FOR CLASS SEPARABILITY

One  of  the  most  important  issues  in  the  design  of  pattern  recognition 
systems  is  to  have  an  a  priori  idea  of  how  well  a  set  of  selected  features 
discriminate  between  different  pattern  classes.  In  the  present  problem, 
this  is  equivalent  to  establishing  whether  or  not  two  or  more  automata 
(of  different  classes)  recognize  the  same  subset  of  strings.  In  other 
words,  a  string  that  is  recognized  by  more  than  one  automaton  is  a  result 
of  overlapping  pattern  classes.  As  discussed  below,  another  important 
property  associated  with  using  finite-state  automata  as  recognizers  is 
that  checking  for  this  type  of  overlap  is  reasonably  straightforward. 

Suppose we have two regular languages $L_1$ and $L_2$, recognized
respectively by deterministic finite automata
$A_1 = (Q_1, \Sigma_1, \delta_1, q_0, F_1)$ and
$A_2 = (Q_2, \Sigma_2, \delta_2, \bar{q}_0, F_2)$; that is, $L_1 = L(A_1)$ and
$L_2 = L(A_2)$. Then an automaton A such that
$L(A) = L(A_1) \cap L(A_2)$ is given by

$A = (Q, \Sigma, \delta, (q_0, \bar{q}_0), F)$

where

(1) $Q = Q_1 \times Q_2 = \{(q, \bar{q}) \mid q \in Q_1,\ \bar{q} \in Q_2\}$

(2) $\Sigma = \Sigma_1 \cup \Sigma_2$

(3) $\delta$ is a mapping from $Q \times \Sigma$ into $Q$ such that, for any
symbol "a" from $\Sigma$,

$\delta((q, \bar{q}), a) = (\delta_1(q, a), \delta_2(\bar{q}, a))$

(4) $(q_0, \bar{q}_0)$ is the starting state

(5) $F = \{(q, \bar{q}) \mid q \in F_1,\ \bar{q} \in F_2\}$

This automaton accepts an input string if and only if both $A_1$ and
$A_2$ accept it. In other words, A recognizes all strings in the
intersection of $L_1$ and $L_2$. The notation introduced above is clarified by
the following example.

Consider the automata

$A_1 = (Q_1, \Sigma_1, \delta_1, q_0, F_1)$

with

$Q_1 = \{q_0, q_1, q_2\}$

$\Sigma_1 = \{a, b\}$

$\delta_1$:

$\delta_1(q_0, a) = q_0$

$\delta_1(q_0, b) = q_1$

$\delta_1(q_1, a) = q_2$

$\delta_1(q_1, b) = q_1$

$\delta_1(q_2, a) = q_2$

$\delta_1(q_2, b) = q_2$

$F_1 = \{q_1\}$

and

$A_2 = (Q_2, \Sigma_2, \delta_2, \bar{q}_0, F_2)$

with

$Q_2 = \{\bar{q}_0, q_4, q_5\}$

$\Sigma_2 = \{a, b\}$

$\delta_2$:

$\delta_2(\bar{q}_0, a) = \bar{q}_0$

$\delta_2(\bar{q}_0, b) = q_4$

$\delta_2(q_4, a) = q_5$

$\delta_2(q_4, b) = q_5$

$\delta_2(q_5, a) = q_5$

$\delta_2(q_5, b) = q_5$

$F_2 = \{q_5\}$

The state transition diagrams for $A_1$ and $A_2$ are shown in Figs. 12(a)
and (b). The languages recognized by these automata are, respectively,

$L_1 = L(A_1) = \{a^n b\, b^m \mid n, m \ge 0\}$

$L_2 = L(A_2) = \{a^n b (a + b)(a + b)^m \mid n, m \ge 0\}$

where the "+" indicates "or". That is, an element of $L(A_1)$ is a (possibly
empty) string of a's of length n ($n \ge 0$), followed by a string of at
least one b, and an element of $L(A_2)$ is a (possibly empty) string of a's
followed by exactly one b, followed by a string of at least one a or b,
in any combination of a's and b's.


In order to form the automaton A which recognizes
$L(A_1) \cap L(A_2)$, we proceed as follows. Using the notation introduced
above,

$A = (Q, \Sigma, \delta, (q_0, \bar{q}_0), F)$

where the starting state is $(q_0, \bar{q}_0)$, and

$Q = Q_1 \times Q_2 = \{(q_0, \bar{q}_0), (q_0, q_4), (q_0, q_5), (q_1, \bar{q}_0), (q_1, q_4), (q_1, q_5), (q_2, \bar{q}_0), (q_2, q_4), (q_2, q_5)\}$

$\Sigma = \Sigma_1 \cup \Sigma_2 = \{a, b\}$

$\delta$:

$\delta((q_0, \bar{q}_0), a) = (q_0, \bar{q}_0)$      $\delta((q_0, \bar{q}_0), b) = (q_1, q_4)$

$\delta((q_1, q_4), a) = (q_2, q_5)$      $\delta((q_1, q_4), b) = (q_1, q_5)$

$\delta((q_2, q_5), a) = (q_2, q_5)$      $\delta((q_2, q_5), b) = (q_2, q_5)$

$\delta((q_1, q_5), a) = (q_2, q_5)$      $\delta((q_1, q_5), b) = (q_1, q_5)$

$F = \{(q_1, q_5)\}$

It is noted that the set Q consists of all ordered pairs of states
in $Q_1$ and $Q_2$. Also the notation $(q_i, q_j)$ refers to a single state of
A, and is used to accentuate the fact that a state of A arises from states
$q_i$ and $q_j$ in $A_1$ and $A_2$, respectively. Thus, A may be viewed as
implementing $A_1$ and $A_2$ simultaneously in parallel. Automaton A acts as
$A_1$ and $A_2$ driven by the same input, with A accepting an input if and
only if both $A_1$ and $A_2$ accept it.

With reference to the above $\delta$ mappings, it is noted that no transition
functions were specified for $(q_0, q_4)$, $(q_2, \bar{q}_0)$, $(q_0, q_5)$,
$(q_1, \bar{q}_0)$, and $(q_2, q_4)$. The reason for this is easily explained
by noting (see Fig. 12) that each of these five composite states pairs a
state of $A_1$ with a state of $A_2$ that cannot both be reached by the same
input string starting from $q_0$ and $\bar{q}_0$, respectively. This implies
that no input string exists which can cause A to reach any of the five
states listed above by starting at its initial state $(q_0, \bar{q}_0)$.
These unreachable states can be eliminated from the state set Q without
affecting $L(A)$.

The language

$L(A) = L(A_1) \cap L(A_2) = \{a^n b\, b\, b^m \mid n, m \ge 0\}$

consists of strings which have the form of a series (possibly empty) of
a's followed by a string of at least two b's, which are the strings
accepted by both $A_1$ and $A_2$.
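
The product construction, restricted to reachable composite states, can be
sketched as follows. The function names, the tuple layout of a
deterministic automaton, and the state label "q0_bar" for the starting
state of the second automaton are assumptions made for illustration; both
transition tables are assumed to be complete over the shared alphabet.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]
DFA = Tuple[Dict[Tuple[str, str], str], str, Set[str]]  # (delta, start, finals)


def intersect(a1: DFA, a2: DFA, sigma: List[str]):
    """Build the reachable part of the automaton accepting L(A1) and L(A2)."""
    d1, s1, f1 = a1
    d2, s2, f2 = a2
    start: Pair = (s1, s2)
    delta: Dict[Tuple[Pair, str], Pair] = {}
    reachable: Set[Pair] = {start}
    frontier = [start]
    while frontier:
        p, q = frontier.pop()
        for symbol in sigma:
            # Both automata read the same symbol in parallel.
            nxt = (d1[(p, symbol)], d2[(q, symbol)])
            delta[((p, q), symbol)] = nxt
            if nxt not in reachable:
                reachable.add(nxt)
                frontier.append(nxt)
    finals = {(p, q) for (p, q) in reachable if p in f1 and q in f2}
    return delta, start, finals, reachable


def accepts(delta, start, finals, string: str) -> bool:
    """Deterministic acceptance test on the product automaton."""
    state = start
    for symbol in string:
        state = delta[(state, symbol)]
    return state in finals


A1: DFA = ({("q0", "a"): "q0", ("q0", "b"): "q1", ("q1", "a"): "q2",
            ("q1", "b"): "q1", ("q2", "a"): "q2", ("q2", "b"): "q2"},
           "q0", {"q1"})
A2: DFA = ({("q0_bar", "a"): "q0_bar", ("q0_bar", "b"): "q4", ("q4", "a"): "q5",
            ("q4", "b"): "q5", ("q5", "a"): "q5", ("q5", "b"): "q5"},
           "q0_bar", {"q5"})

delta, start, finals, reachable = intersect(A1, A2, ["a", "b"])
```

In this sketch an empty finals set would indicate that no reachable
accepting state exists, that is, that the two languages do not overlap. For
the example, four composite states are reachable and one of them is
accepting.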

The procedure just discussed has two important implications. First,
if $L(A_1)$ and $L(A_2)$ are disjoint, then $L(A)$ will be the empty set,
indicating perfect recognition by $A_1$ and $A_2$. Second, if $L(A)$ is not
empty, the structure of the strings in the overlapping subset can easily be
examined. As indicated in the following section, one approach for
eliminating the overlap is to use semantics. It is also noted that the
method is easily extended to a multiclass situation simply by considering
two classes at a time.
VI. ORGANIZATION OF THE SYNTACTIC/SEMANTIC CHARACTER RECOGNITION SYSTEM

The  purpose  of  this  section  is  to  organize  the  material  presented  in 
the  previous  sections  in  the  form  of  a  system  which  utilizes  both  syntax 
and  semantics  in  the  recognition  process. 

The basic approach proposed for designing (training) the recognition
system is shown diagrammatically in Fig. 13, where the dashed lines indicate
interactive user inputs. The various stages of this approach are
discussed in the following paragraphs.

The training set consists of a set of thinned characters of known
classification. The hierarchical feature extraction and attribute
assignment stage is based on the methods discussed in Section 2. It is noted
that,  although  the  features  to  be  extracted  are  well  defined,  the 
semantic  rules  of  some  of  these  features  require  the  specification  of 
one  or  more  thresholds.  For  example,  the  semantic  rules  for  lake  features 
require  the  specification  of  thresholds  ^(4)  and  T,,(4).  The  procedure 
for  doing  this  is  to  compute  this  particular  feature  for  all  appropriate 
characters  in  the  training  set  and  then  to  select  the  thresholds  as  the 
minimum  values  which  encompass  all  acceptable  lakes,  with  the  degree  of 
acceptability  being  established  by  the  user  via  a  display  examination  of 
these  features.  In  other  words,  the  user  must  determine  what  constitutes 
an  acceptable  lake  feature  and  the  thresholds  are  used  to  place  limits  on 
the  class  of  acceptable  features.  The  selection  of  thresholds  for  other 
features  is  carried  out  in  a  similar  manner. 

The  next  step  in  the  training  procedure  is  to  arrange  the  extracted 
features  in  the  form  of  a  string,  as  discussed  in  Section  2.10.  These 
strings  are  then  fed  into  the  grammatical  inference  stage,  where  an 
automaton  is  generated  for  each  class  using  the  method  discussed  in 
Section  4.  It  is  noted  that  the  only  parameter  required  by  the  inference 
algorithm  is  a  value  of  k  to  establish  the  k-tail. 

Given a value of k, the resulting automata are then used to recognize
the training set. It is expected that this stage of the process will
require the most intensive user interaction. The automata generated by
the inference algorithm provide the basic syntax (structure) of the
training set subject to the limitations inherent in regular grammars. It
is unlikely that this formalism by itself will be sufficient to completely
classify the training set correctly. The two tools available to increase
discrimination are the check for class separability shown as the next
stage in Fig. 13, and the use of semantic rules. The set of overlapping
strings in any two classes can easily be established using the procedure
discussed in Section 5. This information can be used to study the
structure of the strings that are not uniquely recognizable, and semantic
rules can be introduced to resolve the conflicts, as discussed in
Section 3. It is noted that the use of a display to show the appropriate
automata and to highlight the state transitions followed in recognizing a
given string will be a valuable aid in establishing the necessary semantic
information.

The  design  of  the  syntactic/semantic  recognizer  is  complete  once 
the  training  set  is  recognized  with  acceptable  accuracy.  During  automatic 
operation,  the  structure  of  the  system  consists  of  the  stages  shown  in 
Fig. 14. In this mode of operation a character can be rejected prior to
going into the recognition stage if its features and attributes are
outside the learned thresholds or fail to satisfy the corresponding
semantic rules. If a character passes this test, it is fed into the
recognizer (automata). At this point it is assigned to a character class or
rejected if it fails to be accepted by a unique automaton based on the
syntactic/semantic information developed for this stage during the
training phase.


Figure  13.  Structure  of  the  syntactic/semantic  recognizer 
during  training.  Dashed  lines  indicate  interactive  inputs. 
VII. CONCLUSIONS AND RECOMMENDATIONS

The material discussed in the previous sections represents a unified
approach for the development of a syntactic/semantic character recognition
system. The most important aspects of this approach are: (1) a hierarchical,
semantics-based feature extractor, (2) a formulation that leads to
representations that can be handled with string grammars, (3) the use of
semantics in the recognition process, (4) the use of a procedure for
studying class separability, and (5) a proposed interactive approach
which combines automatic syntactic processing with user-defined semantic
rules.

Although  the  overall  system  structure  has  been  developed  in  some 
detail,  there  are  a  number  of  areas  that  require  further  investigation. 

In  particular,  we  recommend  that  the  following  tasks  be  carried  out  as  the 
next  step  in  this  project. 

Task  1.  Extension  of  semantic  rules  for  the  hierarchical  feature  extractor. 
The  semantic  rules  proposed  in  the  report  are  preliminary.  This 
task  will  consist  of  extending  and  refining  the  semantic  rules 
for  feature  description  in  the  context  of  the  NORDA  OCR  system. 

Task  2.  Evaluate  the  parsing  approach  to  recognition.  This  task  will 

investigate  the  formulation  of  a  parsing  (vs.  automaton)  approach  to 
recognition.  Parsing  algorithms  are  generally  faster  and  this 
task  will  address  the  problem  of  incorporating  semantic  rules  into 
the  parsing  process. 

Task  3.  Extend  the  semantic  rules  for  the  syntactic/semantic  recognizer . 

Considerable  work  remains  to  be  done  in  proposing  semantic  rules 
for  the  recognition  stage.  This  task  will  address  the  problem 
of  specifying  a  set  of  semantic  rules  for  each  of  the  ten 
numeral  classes,  with  possible  extension  to  alphanumerics. 

Task 4. Extend the syntactic/semantic approach to border-oriented
features. Work done to date on the syntactic/semantic approach
has been focused on skeleton-oriented features. It is known
that border-oriented features can be very useful in situations
involving characters such as filled-in 8's. This task will
address the extension of the hierarchical feature extractor and
the syntactic/semantic recognizer for handling border features.

Task 5. Refine the interactive approach used in the design of the
recognition system. The approach used in the specification of
semantic rules is highly interactive. This task will consist of
developing a specific formulation for the implementation of this
approach, including techniques for user specification of
relevant parameters.

APPENDIX A


The  following  discussion  deals  with  elliptical  symmetry.  This  type 
of  feature  is  useful  in  character  recognition  for  determining  the  quality 
of  characters  composed,  either  partially  or  entirely,  of  elliptical 
segments,  such  as  9's  and  0's.  Two  procedures  are  developed  below.  The 
first  is  based  on  a  minimum-error  elliptical  fit,  while  the  second  applies 
to any type of symmetry about two principal axes. Both methods are
independent  of  rotation. 


A.1 Procedure 1

Given K sets of points, $E_i$, $i = 1, 2, \ldots, K$, with set $E_i$
containing $\#E_i$ points, the following procedure individually measures
elliptical symmetry about two principal orthogonal axes for each set. In
terms of character recognition, each set $E_i$ contains the coordinate
points of, for example, the skeleton of a character, and i ranges over the
number of characters (K) to be processed.

(a) Let $\{x_n, y_n\}$, $n = 1, 2, \ldots, \#E_i$, represent the
coordinates of all points in $E_i$.

(b) Define the column vectors $\mathbf{z}_n = (x_n, y_n)'$, where the
prime (') indicates transposition.

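(c) Compute the 2x2 covariance matrix

$$\mathbf{C}_i = \frac{1}{\#E_i} \sum_{n=1}^{\#E_i} (\mathbf{z}_n - \mathbf{m}_i)(\mathbf{z}_n - \mathbf{m}_i)'$$

where

$$\mathbf{m}_i = \frac{1}{\#E_i} \sum_{n=1}^{\#E_i} \mathbf{z}_n$$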

(d) Compute the two orthogonal eigenvectors and corresponding
eigenvalues of $\mathbf{C}_i$. (Since the matrix is real and symmetric,
the existence of orthogonal eigenvectors is guaranteed.
Almost any scientific package will contain a subroutine for
computing the eigenvectors and eigenvalues of a real, symmetric
matrix.) These two eigenvectors, denoted by $\mathbf{e}_1 = (a, b)'$ and
$\mathbf{e}_2 = (c, d)'$, point in the directions of principal data spread,
subject to the orthogonality constraint. The amount of
spread is proportional to the eigenvalues, and it is assumed
for notational convenience that the largest eigenvalue
corresponds to $\mathbf{e}_1$.

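(e) The equation of an ellipse in the (x,y) plane centered about
$\mathbf{m}_i$ and with $\mathbf{e}_1$ and $\mathbf{e}_2$ as the principal
axes is given by

$$(\mathbf{z} - \mathbf{m}_i)' \mathbf{C}_i^{-1} (\mathbf{z} - \mathbf{m}_i) - \theta = 0$$

where $\theta$ is a threshold that controls the size of the ellipse.

(f) Find a least-squared elliptical fit to the set of points
$\{(x_n, y_n)\}$ by finding a value of $\theta$ which minimizes the quantity

$$R(\theta) = \sum_{n=1}^{\#E_i} \left[ (\mathbf{z}_n - \mathbf{m}_i)' \mathbf{C}_i^{-1} (\mathbf{z}_n - \mathbf{m}_i) - \theta \right]^2 \qquad (1)$$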

It is shown below that both the expected value of the quadratic form in (1)
and the threshold which minimizes this expression (i.e., gives the optimum
fit) are equal to the dimensionality of $\mathbf{z}$. Thus, given a set of
points to be tested for symmetry, we obtain a measure of elliptical symmetry
by computing (1) with the optimum threshold and either comparing the result
against a perfect ellipse (i.e., zero error in (1)) or against the expected
value of (1). The first approach is applicable when a fine measure is
desired, while the second approach is more rugged.

A.2 Expected Value of Q(z)

Let

$$Q(\mathbf{z}) = (\mathbf{z} - \mathbf{m})' \mathbf{C}^{-1} (\mathbf{z} - \mathbf{m}) \qquad (2)$$

where $\mathbf{z}$ is a random vector of dimension d, and $\mathbf{m}$ and
$\mathbf{C}$ are the mean vector and covariance matrix of the population
from which the $\mathbf{z}$'s are drawn.

Consider the linear transformation

$$\mathbf{u} = \mathbf{A}\mathbf{z} \qquad (3)$$

where $\mathbf{A}$ is a d x d matrix. Then the mean vector of the
$\mathbf{u}$'s is given by


$$\mathbf{m}^* = E(\mathbf{u}) = E(\mathbf{A}\mathbf{z}) = \mathbf{A}\,E(\mathbf{z}) = \mathbf{A}\mathbf{m} \qquad (4)$$


Similarly,  the  covariance  matrix  of  the  u's  is  given  by 

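$$\mathbf{C}^* = E\{(\mathbf{u} - \mathbf{m}^*)(\mathbf{u} - \mathbf{m}^*)'\}
= E\{(\mathbf{A}\mathbf{z} - \mathbf{A}\mathbf{m})(\mathbf{A}\mathbf{z} - \mathbf{A}\mathbf{m})'\}
= \mathbf{A}\,E\{(\mathbf{z} - \mathbf{m})(\mathbf{z} - \mathbf{m})'\}\,\mathbf{A}'
= \mathbf{A}\mathbf{C}\mathbf{A}' \qquad (5)$$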


Since $\mathbf{C}$ is a symmetric matrix, a complete set of orthonormal
eigenvectors for this matrix can always be found. If the rows of
$\mathbf{A}$ are chosen as these vectors, then Eq. (3) becomes the Hotelling
transform and it is well known that $\mathbf{C}^*$ will be a diagonal matrix
with main diagonal component, $\lambda_k$, equal to the variance of the kth
component of $\mathbf{u}$, for $k = 1, 2, \ldots, d$.

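From Eqs. (2) through (5),

$$Q(\mathbf{u}) = (\mathbf{u} - \mathbf{m}^*)' \mathbf{C}^{*-1} (\mathbf{u} - \mathbf{m}^*)
= (\mathbf{z} - \mathbf{m})' \mathbf{A}' (\mathbf{A}\mathbf{C}\mathbf{A}')^{-1} \mathbf{A} (\mathbf{z} - \mathbf{m})
= (\mathbf{z} - \mathbf{m})' \mathbf{C}^{-1} (\mathbf{z} - \mathbf{m})
= Q(\mathbf{z}) \qquad (6)$$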


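It then follows that

$$E\{Q(\mathbf{z})\} = E\{Q(\mathbf{u})\} \qquad (7)$$

However, since $\mathbf{C}^*$ is a diagonal matrix,

$$E\{Q(\mathbf{u})\} = E\{(\mathbf{u} - \mathbf{m}^*)' \mathbf{C}^{*-1} (\mathbf{u} - \mathbf{m}^*)\}
= E\left\{ \sum_{k=1}^{d} \frac{(u_k - m_k^*)^2}{\lambda_k} \right\}
= \sum_{k=1}^{d} \frac{1}{\lambda_k} E\{(u_k - m_k^*)^2\}
= \sum_{k=1}^{d} \frac{\sigma_k^2}{\lambda_k} \qquad (8)$$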


where $\sigma_k^2 = E\{(u_k - m_k^*)^2\}$ is the variance of component
$u_k$ which, based on the above discussion, is equal to $\lambda_k$. From
Eqs. (7) and (8), we then have

$$E\{Q(\mathbf{z})\} = \sum_{k=1}^{d} 1 = d \qquad (9)$$

That  is,  the  expected  value  of  Q(z)  is  equal  to  the  dimensionality  of  z. 
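
The result can be checked numerically for d = 2. The following is a minimal sketch, not part of the original study: the function names, the synthetic data, and the normalization of the sample covariance by N are all assumptions made for illustration. It builds A from the closed-form eigenvectors of a 2x2 symmetric matrix, applies u = A z, and averages Q(u) over the sample.

```python
import math
import random

def mean_and_cov(points):
    """Sample mean and 2x2 covariance (normalized by N, an assumption)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), (cxx, cxy, cyy)

def eigen_sym2(cxx, cxy, cyy):
    """Eigenvalues (largest first) and orthonormal eigenvectors of the
    symmetric matrix [[cxx, cxy], [cxy, cyy]], in closed form."""
    half_trace = 0.5 * (cxx + cyy)
    root = math.hypot(0.5 * (cxx - cyy), cxy)
    phi = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)  # direction of largest spread
    e1 = (math.cos(phi), math.sin(phi))
    e2 = (-math.sin(phi), math.cos(phi))
    return (half_trace + root, half_trace - root), (e1, e2)

def hotelling(points):
    """Apply u = A z, where the rows of A are the eigenvectors of C."""
    _, cov = mean_and_cov(points)
    lams, (e1, e2) = eigen_sym2(*cov)
    u = [(e1[0] * x + e1[1] * y, e2[0] * x + e2[1] * y) for x, y in points]
    return u, lams

def mean_q(points):
    """Sample average of Q(u) = sum_k (u_k - m*_k)^2 / lambda_k."""
    u, (lam1, lam2) = hotelling(points)
    (m1, m2), _ = mean_and_cov(u)
    total = sum((a - m1) ** 2 / lam1 + (b - m2) ** 2 / lam2 for a, b in u)
    return total / len(u)

# Correlated synthetic points (hypothetical data; fixed seed for repeatability).
rng = random.Random(0)
pts = []
for _ in range(500):
    g1, g2 = rng.gauss(0, 1), rng.gauss(0, 1)
    pts.append((3.0 * g1 + 1.0, 1.5 * g1 + 0.5 * g2 - 2.0))
```

Under these assumptions the covariance of the transformed points is diagonal with the eigenvalues on its diagonal, and `mean_q(pts)` equals 2 up to rounding, which is the sample counterpart of Eq. (9).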

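A.3 Optimum Threshold

Let

$$Q_i(\mathbf{z}_n) = (\mathbf{z}_n - \mathbf{m}_i)' \mathbf{C}_i^{-1} (\mathbf{z}_n - \mathbf{m}_i) \qquad (10)$$

Then (1) may be expressed as

$$R(\theta) = \sum_{n=1}^{\#E_i} \left[ Q_i(\mathbf{z}_n) - \theta \right]^2 \qquad (11)$$

The minimum of this expression is obtained by setting the partial derivative
with respect to $\theta$ equal to zero and then solving for $\theta$. The
result is

$$\hat{\theta} = \frac{1}{\#E_i} \sum_{n=1}^{\#E_i} Q_i(\mathbf{z}_n) \qquad (12)$$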
where $\hat{\theta}$ denotes the value of $\theta$ which minimizes (11). The
right side of Eq. (12) is recognized as an approximation to the expected
value of $Q_i(\mathbf{z})$, where the i in this context denotes the
population of vectors belonging to set $E_i$. It then follows from Eqs. (7),
(9), and (12) that

$$\hat{\theta} \approx E\{Q_i(\mathbf{z})\} = d \qquad (13)$$

In other words, the value of $\theta$ which minimizes (11) is equal to the
dimension of the $\mathbf{z}$'s.

For character recognition applications, the $\mathbf{z}$'s are pixel
coordinates and d = 2. Thus, the least-squares elliptical fit to the points
in $E_i$ is obtained in this case by setting $\theta = \hat{\theta} = 2$
in (1).
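
Procedure 1 with the threshold set to 2 can be sketched as follows. This is an illustrative sketch only: the function names and test shapes are hypothetical, and the covariance is assumed to be normalized by the number of points, in which case the sample average of Q_i is exactly 2.

```python
import math

def quadratic_forms(points):
    """Q_i(z_n) = (z_n - m)' C^{-1} (z_n - m) for each 2-D point."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = cxx * cyy - cxy * cxy
    if det <= 0.0:
        raise ValueError("covariance matrix is singular (collinear points)")
    q = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Inverse of a 2x2 matrix written out explicitly.
        q.append((cyy * dx * dx - 2.0 * cxy * dx * dy + cxx * dy * dy) / det)
    return q

def fit_error(points, theta=2.0):
    """R(theta) of Eq. (1); theta = 2 is the optimum for 2-D points."""
    return sum((q - theta) ** 2 for q in quadratic_forms(points))

def optimum_threshold(points):
    """Right side of Eq. (12): the sample average of Q_i(z_n)."""
    q = quadratic_forms(points)
    return sum(q) / len(q)

def ellipse_points(a, b, angle, n=64):
    """Points on a rotated, shifted ellipse (hypothetical test shape)."""
    pts = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        x, y = a * math.cos(t), b * math.sin(t)
        pts.append((x * math.cos(angle) - y * math.sin(angle) + 5.0,
                    x * math.sin(angle) + y * math.cos(angle) - 1.0))
    return pts

def square_points(n_side=16):
    """Points on the outline of the square [-1, 1] x [-1, 1]."""
    pts = []
    for k in range(n_side):
        t = -1.0 + 2.0 * k / n_side
        pts += [(t, -1.0), (1.0, t), (-t, 1.0), (-1.0, -t)]
    return pts
```

For evenly sampled points on an ellipse, `fit_error` is zero up to rounding regardless of rotation, while the square outline gives a clearly nonzero error. This is the "perfect ellipse" comparison described at the end of Section A.1.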


A.4 Procedure 2

This  procedure  also  uses  the  eigenvectors  described  above,  but  is 
more  general  in  the  sense  that  it  applies  to  any  type  of  symmetry  about 
two  principal  axes. 

(a)  Repeat  steps  (a)  through  (d)  in  Procedure  1. 

(b) The perpendicular distance between any point $\mathbf{z}_n$ in $E_i$
and a line containing $\mathbf{e}_1$ is given by


Dnm 


2  2  1/2 

where  || H  =  +  “  ]  '  .  Compute  the  average  perpendicular 

distance  between  this  line  and  all  points  lying  on  its  positive 
side  (z^  lies  on  the  positive  side  of  the  line  if  z^e^  >  0)*  This 
average  distance  is  given  by 


<<*)  -7  1 


where  the  summation  is  taken  over  values  of  n  for  which  >  0 
and  N+  is  the  number  of  points  satisfying  this  condition. 

(c) Compute the average perpendicular distance of the points lying
on the negative side of the line containing $\mathbf{e}_1$. This quantity
is given by
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d;«>  ■  IF  I  “„«> 

where  the  summation  is  taken  over  values  of  n  for  which 
z^e^  _<  0  and  N"  is  the  number  of  points  satisfying  this 
condition. 

(d) Repeat steps (b) and (c) using $\mathbf{e}_2$ to obtain $D_2^+(i)$
and $D_2^-(i)$.

(e) Define symmetry measures about $\mathbf{e}_1$ and $\mathbf{e}_2$ as
$s_1(i) = |D_1^+(i) - D_1^-(i)|$ and $s_2(i) = |D_2^+(i) - D_2^-(i)|$ for
$i = 1, 2, \ldots, K$.

An average measure of deviation from symmetry about the principal axes is
given by the respective values of $s_1(i)$ and $s_2(i)$. If, for example, the
points in $E_i$ are symmetrical about these axes, then $s_1(i) = s_2(i) = 0$.
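
Procedure 2 can be sketched along the following lines. Several details are assumptions made only for this illustration: each axis is taken to pass through the sample mean, the eigenvectors are unit length, the side of a line is decided by the sign of the projection onto the other eigenvector, and points with zero projection are counted on the negative side. All function names are hypothetical.

```python
import math

def principal_axes(points):
    """Sample mean and unit eigenvectors (e1 = direction of largest spread)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cxx = sum((x - mx) ** 2 for x, _ in points) / n
    cyy = sum((y - my) ** 2 for _, y in points) / n
    cxy = sum((x - mx) * (y - my) for x, y in points) / n
    phi = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    e1 = (math.cos(phi), math.sin(phi))
    e2 = (-math.sin(phi), math.cos(phi))
    return (mx, my), e1, e2

def side_gap(points, mean, normal):
    """|D+ - D-| for the line through `mean` with unit normal `normal`."""
    pos, neg = [], []
    for x, y in points:
        # Signed perpendicular distance from the point to the line.
        p = (x - mean[0]) * normal[0] + (y - mean[1]) * normal[1]
        (pos if p > 0.0 else neg).append(abs(p))
    d_pos = sum(pos) / len(pos) if pos else 0.0
    d_neg = sum(neg) / len(neg) if neg else 0.0
    return abs(d_pos - d_neg)

def symmetry_measures(points):
    """Return (s1, s2): deviation from symmetry about e1 and about e2."""
    mean, e1, e2 = principal_axes(points)
    s1 = side_gap(points, mean, e2)  # line containing e1 has normal e2
    s2 = side_gap(points, mean, e1)  # line containing e2 has normal e1
    return s1, s2
```

With these conventions a full ellipse sampled symmetrically gives s1 and s2 near zero, while a half ellipse (for example, the upper arc only) remains symmetric about its minor axis but not about its major axis, so only one of the two measures is near zero.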


APPENDIX B

The  following  discussion  deals  with  a  procedure  for  corner  detection 
and  quantification.  The  method  was  developed  in  an  attempt  to  incorporate 
both  local  and  global  information  in  the  corner  detection  problem.  After 
experimenting with the technique, however, we found that it lacks
sensitivity and that it performs no better than the simpler approaches
suggested in our earlier work [1]. The procedure is included here for
completeness
and  because  its  development  contains  some  concepts  that  may  be  useful  in 
other  contexts.  Its  adoption  as  a  useful  processing  tool  is  not  recommended. 

B.1 Background

A corner may be defined as the four-tuple $C = (c, \alpha, \beta, \theta)$
where c, the corner point, is the point of intersection of two straight line
segments, $\alpha$ and $\beta$, with lengths $|\alpha| > 0$ and
$|\beta| > 0$, respectively. It is assumed that the line segments meet at
one of their extremes, forming an interior angle $\theta$. The corner is
said to be acute if $0 < \theta < \pi/2$, right if $\theta = \pi/2$, obtuse
if $\pi/2 < \theta < \pi$, and degenerate if $\theta = 0$ or $\theta = \pi$.
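
The classification above can be expressed as a short sketch. The function names and the numerical tolerance `tol` are assumptions introduced here; the definition in the text is exact and has no tolerance.

```python
import math

def interior_angle(c, a, b):
    """Interior angle at corner point c between segments c->a and c->b."""
    ax, ay = a[0] - c[0], a[1] - c[1]
    bx, by = b[0] - c[0], b[1] - c[1]
    len_a, len_b = math.hypot(ax, ay), math.hypot(bx, by)
    if len_a == 0.0 or len_b == 0.0:
        raise ValueError("both segments must have positive length")
    # Clamp to [-1, 1] to guard against rounding before calling acos.
    cos_theta = max(-1.0, min(1.0, (ax * bx + ay * by) / (len_a * len_b)))
    return math.acos(cos_theta)

def classify_corner(theta, tol=1e-9):
    """Label an interior angle in [0, pi] following the definition above."""
    if theta < -tol or theta > math.pi + tol:
        raise ValueError("interior angle must lie in [0, pi]")
    if theta <= tol or theta >= math.pi - tol:
        return "degenerate"
    if abs(theta - math.pi / 2.0) <= tol:
        return "right"
    return "acute" if theta < math.pi / 2.0 else "obtuse"
```

For example, `classify_corner(interior_angle((0, 0), (1, 0), (1, 1)))` labels a 45-degree corner as acute.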

In the continuous domain, and in the absence of noise, the relationship
between $\theta$, $|\alpha|$, and $|\beta|$ becomes important only near
degeneracy or when $|\alpha|$ or $|\beta|$ approach zero. In the examples
shown in Fig. 1, for instance, one would have difficulty in visually
recognizing the presence of a corner only when $\theta \to 0$,
$\theta \to \pi$, $|\alpha| \to 0$, or $|\beta| \to 0$. In the presence of
noise, however, the relative values of these parameters play a central role
in our ability to detect a corner, as illustrated in Fig. 2. Part (a) of
this figure shows an acute corner which, for all practical purposes, has
been rendered undetectable by the presence of noise. Figure 2(b), by
contrast, shows a right corner with the same values of $|\alpha|$ and
$|\beta|$ and corrupted by the same amount of noise. This figure is clearly
closer to our intuitive concept of a corner, thus illustrating the
importance of $\theta$ in establishing corner-like properties in a noisy
segment.

The relative importance of $|\alpha|$ and $|\beta|$ is illustrated in
Fig. 3. Part (a) of this figure shows an undetectable acute corner in which
$|\alpha|$ and $|\beta|$ are small with respect to the amount of noise
corrupting the segment, while Fig. 3(b) shows a much more corner-like
segment; this figure has the same amount of noise, but much larger values of
$|\alpha|$ and $|\beta|$. This effect is clearly analogous to the concept of
increased signal-to-noise ratio in communication theory, in the sense that
the greater the corruption the larger $|\alpha|$ and $|\beta|$ would have to
be to define a corner-like segment.

Fig. 2. Effect of $\theta$ in the detectability of corners in
noisy segments. (a) Noisy obtuse corner. (b) A right corner
with the same values of $|\alpha|$ and $|\beta|$ and corrupted by the same
amount of noise.

If we view the process of digitizing a segment as a mechanism that
distorts (i.e., introduces noise to) the spatial integrity of the segment,
it is evident from the preceding discussion that the relationship between
the amount of distortion and the parameters $\theta$, $|\alpha|$, and
$|\beta|$ is an essential consideration in the development of any procedure
for detecting and evaluating corners in digital segments. The examples given
in Figs. 2 and 3 also illustrate the futility of using local corner
detectors or curve-tracing techniques which do not take into account the
values of these or similar parameters for finding corners in digital
segments.

B.2  Corner  Detection  and  Evaluation 

The procedure developed in this section is based on the idea of utilizing
a model of an ideal corner in order to establish a measure of corner
"quality" which takes into account segment distortion and the parameters
$\theta$, $|\alpha|$, and $|\beta|$ defined in the previous section. The
following discussion applies only to simple, thinned digital segments (i.e.,
thinned segments which do not cross themselves) with only two distinct end
points. Multiply-connected segments can be handled by decomposition into
simple segments at the branch points.

With reference to Fig. 4, let $C = (c, \alpha, \beta, \theta)$ denote an
ideal (noiseless) continuous corner with end points a and b, and denote by
$\Delta$ a bound on the spatial distortion of $\alpha$ and $\beta$ as a
result of digitizing C. The parameter $\Delta$ could be, for example, a
function of the variance of the points in a digital segment referenced to
the straight line segments $\alpha$ and $\beta$.

Fig. 3. Effect of $|\alpha|$ and $|\beta|$ in the detectability of corners
in noisy segments. (a) Noisy acute corner. (b) A corner with
the same angle and noise, but with larger values of $|\alpha|$ and
$|\beta|$.

Fig. 5. Geometrical arrangement used to derive a proportionality
factor between an ideal and corresponding distorted corner.

In order to relate C and the "broad" corner defined by the region
between the dashed boundary in Fig. 4, it is necessary to establish a
proportionality factor that involves $\Delta$, $\alpha$, $\beta$, and
$\theta$. This can be accomplished with the aid of Fig. 5. Let d be a
straight line segment starting at c, bisecting $\theta$, and ending at the
intersection of d with line $\delta$. The projection of $\Delta$ along d is
given by

$$\Delta' = \frac{\Delta}{\sin(\theta/2)} \qquad (1)$$


As $\Delta$ increases for a fixed value of $\theta$, $\Delta'$ increases and
the area of triangle a'c'b' decreases proportionally. Similarly, for fixed
$\Delta$, a decrease in $\theta$ causes $\Delta'$ to increase and the area
of a'c'b' to decrease. Clearly, the smaller this area, the greater the
difference between the ideal corner and a distorted corner with bound
$\Delta$. Suppose, however, that we require that the distorted corner be
scaled so that the area a'c'b' is equal to the area associated with the
ideal corner (i.e., the area of abc). From elementary trigonometry, it then
follows that the length of line d must be extended to
$|d_g| = |d| + \Delta'$. Writing this as a proportion, we have

$$\frac{|d_g|}{|d|} = \frac{|d| + \Delta'}{|d|} \qquad (2)$$

or, using Eq. (1):

$$\gamma = 1 + \frac{\Delta}{|d| \sin(\theta/2)} \qquad (3)$$

where $\gamma$ is the proportionality factor $|d_g|/|d|$.
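
Equations (1) and (3) are simple enough to evaluate directly. The sketch below uses hypothetical function and argument names (`delta` for the distortion bound, `d_len` for the length of d) and is meant only to show how the factor behaves as the angle changes.

```python
import math

def projected_distortion(delta, theta):
    """Eq. (1): projection of the distortion bound along the bisector d."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return delta / math.sin(theta / 2.0)

def proportionality_factor(delta, theta, d_len):
    """Eq. (3): gamma = 1 + delta / (|d| sin(theta / 2))."""
    if d_len <= 0.0:
        raise ValueError("the bisector length |d| must be positive")
    return 1.0 + projected_distortion(delta, theta) / d_len
```

For a fixed distortion bound and bisector length, the factor grows as the angle becomes more acute, in line with the discussion preceding Eq. (2).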


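By using the law of sines, it follows from Fig. 5 that

$$\frac{|\delta_1|}{\sin(\theta/2)} = \frac{|d|}{\sin\theta_a}
\quad \text{and} \quad
\frac{|\delta|}{\sin\theta} = \frac{|\beta|}{\sin\theta_a}$$

so that

$$|d| = \frac{|\delta_1|\,|\beta| \sin\theta}{|\delta| \sin(\theta/2)} \qquad (4)$$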
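
Equation (4) can be checked against direct coordinate geometry. The sketch below rests on assumptions that cannot be confirmed without Fig. 5: that $\delta$ is the segment joining the end points a and b, and that $\delta_1$ is the part of $\delta$ between a and the point where d meets it. The function name is hypothetical.

```python
import math

def bisector_length(alpha, beta, theta):
    """Return (|d| from Eq. (4), |d| measured directly) for a corner with
    segment lengths alpha, beta and interior angle theta."""
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    # Place c at the origin, a on the x axis, and b at angle theta.
    a = (alpha, 0.0)
    b = (beta * math.cos(theta), beta * math.sin(theta))
    ux, uy = math.cos(theta / 2.0), math.sin(theta / 2.0)  # bisector direction
    abx, aby = b[0] - a[0], b[1] - a[1]
    # Solve a + t (b - a) parallel to u using 2-D cross products.
    t = -(a[0] * uy - a[1] * ux) / (abx * uy - aby * ux)
    p = (a[0] + t * abx, a[1] + t * aby)  # where d meets the segment ab
    delta = math.hypot(abx, aby)
    delta_1 = math.hypot(p[0] - a[0], p[1] - a[1])
    d_eq4 = delta_1 * beta * math.sin(theta) / (delta * math.sin(theta / 2.0))
    d_direct = math.hypot(p[0], p[1])
    return d_eq4, d_direct
```

Under these assumptions the two returned values agree to rounding error for any valid corner, which supports the reading of Eq. (4) given above.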


Since d bisects $\theta$, we also have the relation
$|\delta_1| / |\delta_2| = |\alpha| / |\beta|$ and, using the
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